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]/ PR-US010482^ ^ PATENT 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 
In re Application of 

Hilmer RAUHE et al. 
Serial No.: 09/937,727 
National Stage 

Filing Date: September 28, 2001 



For: INFORMATION-CARRYING 
AND INFORMATION- 
PROCESSING POLYMERS 



RESUBMISSION OF SEQUENCE LISTING 

Assistant Commissioner of Patents 
Washington, DC 20231 

Sir: 

In response to the April 18, 2002 USPTO communication, Applicants resubmit a 
copy of the Sequence Listing in computer readable form. I hereby state that the information 
recorded in computer readable form is identical to the written sequence listing and the 
information originally submitted in computer readable form. I hereby state that the copy of 
the Sequence Listing in computer readable form being submitted herewith contains no new 
matter. 

Respectfully submitted, 



Int'l Serial No.: PCT/EP00/02893 
Int'l Filing Date: March 31, 1999 



Todd M. Guise 
Reg. No. 46,748 

SHINJYU GLOBAL IP COUNSELORS, LLP 
1233 Twentieth Street, NW, Suite 700 
Washington, D.C. 20036 
(202)-293-0444 , 

Dated: g//7/b<£ 

G:\05-May02-MbM\P6-US0i0482 Sequence Listing 
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PR-US010482 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re Application of 



Serial No.: 09/937,727 



Hilmer RAUHE et al 




Int'l Serial No.: PCT/EPOO/02893 



National Stage 

Filing Date: September 28, 2001 



Int'l Filing Date: March 31, 1999 



For: INFORMATION-CARRYING 
AND INFORMATION- 
PROCESSING POLYMERS 



SUPPLEMENTAL PRELIMINARY AMENDMENT 



Assistant Commissioner of Patents 
Washington, DC 20231 

Sir: 

Please amend the subject application as follows: 
TN THE SPECIFICATION : 

On page 1, line 3, before the section entitled DESCRIPTION, please insert the 



- This application is the National Stage of International Application Serial No. 
PCT/EPOO/02893, filed March 31, 1999 in German. - 

REMARKS 

Entrance of this Amendment is respectfully requested. Attached hereto is a marked 
version of the changes made to the specification by the current Amendment. The page 



following: 



o 3s» jl, y i 



Serial No.: 09/937,727 
Filed: September 28, 2001 
Page 2 of 3 



captioned "MARKED-UP VERSION OF AMENDMENTS" include markings to show the 
changes made by the current Amendment. 



Respectfully submitted, 



Todd M. Guise 
Reg. No. 46,748 
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MARKED-UP VERSION OF AMENDMENTS 

IN THE SPECIFICATION : 

On page 1, line 3, before the section entitled DESCRIPTION, the following was 
inserted: 

- This application is the National Stage of International Application Serial No. 
PCT/EP00/02893, filed March 31, 1999 in German. -- 
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PATENT 



PR-USO 10482 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



Hilmer RAUHE et al. 

Serial No.: PCT/EPOO/02893 

Filed: March 31, 1999 

For: INFORMATION-CARRYING 
AND INFORMATION- 
PROCESSING POLYMERS 



Assistant Commissioner of Patents 
Washington, DC 20231 

Sir: 

Prior to examination of the above-identified application, please amend the subject 
application as follows: 

IN THE CLAIMS : 

Please replace claim 1 with the following amended version: 

1 . (Amended) A method for producing information-carrying polymers, 

comprising: 



Step I defining a regular grammar G = (£, V, R, S) with a finite terminal 
alphabet X, a finite set of variables V, a finite set of rules R, and a start 
symbol S; 

Step II being the NFR method (Niehaus-Feldkamp-Rauhe method) for 
producing monomer sequences; 

Step III implementing, with the NFR method, a grammar as defined in Step 
I, by producing with the NFR method monomer sequences that 
unambiguously represent said set of rules R of a grammar G; 




In re Application of 



PRELIMINARY AMENDMENT 
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Step IV assembling, from said monomer sequences manufactured in Step 
HI, for each rule of said set of rules R of G an oligomer representing that 
rule; 

Step V linking said oligomers assembled in Step IV to information-carrying 
polymers coding words of said regular grammar G. 

Please replace claim 2 with the following amended version: 

2. (Amended) The method according to claim 1, wherein said terminal 
alphabet £ of a grammar G contains terminals 0 and 1 , n start terminals s (s 0 , Si, . . . , sj and 
m end terminals e (eo, ei, . - 0, wherein n and m are integers larger than or equal to 0. 

Please replace claim 3 with the following amended version: 

3. (Amended) The method according to claim 1 , wherein said monomer 
sequence constructed in Step in includes nucleotides. 

Please replace claim 4 with the following amended version: 

4. (Amended) The method according to claim 3, wherein said monomer 
sequence constructed in Step III includes protein recognition sequences. 

Please replace claim 5 with the following amended version: 

5. (Amended) The method according to claim 1 , wherein said synthesis of 
said monomer sequences in Step II and HI is carried out in vitro. 

Please replace claim 6 with the following amended version: 

6. (Amended) The method according to claim 1 , wherein said information- 
carrying polymers obtained in Step V are ligated into cloning vectors; 

competent cells are transformed with these vectors; and 

the transformed cells are selected according to selection markers. 

Please replace claim 7 with the following amended version: 



-2- 



Serial No.: PCT/EP00/02893 
Filed: March 31, 1999 
Page 3 of 19 

7. (Amended) The method according to claim 1, further comprising 
obtaining one pair of anti-sense primers, each of said primers is mixed into at 

least n-1 solutions containing said information-carrying polymer, wherein n 
^ is the number of oligomers contained as elongators in said polymer; 

carrying out at least n-1 PCR approaches, wherein n is the number of 

oligomers contained as elongators in said polymer, and one primer of each 

pair primes in the terminator opposite to the elongator and the other primer 

primes in the elongator itself; 
obtaining polymer fragments by PCR, wherein said polymer fragments are 

separated by length using electrophoresis; and 
optically reading out a pattern obtained by electrophoresis. 

Please replace claim 8 with the following amended version: 

8. (Amended) The method according to claim 7, wherein said reading out of 
said pattern is performed automatically with a scanner or a sequencing machine. 

Please replace claim 9 with the following amended version: 

9. (Amended) Information-carrying polymers produced by the steps 
comprising: 

defining a regular grammar G = (£, V, R, S) with a finite terminal alphabet £, 

a finite set of variables V, a finite set of rules R, and a start symbol S; 
being the NFR method (Niehaus-Feldkamp-Rauhe method) for producing 

monomer sequences; 
implementing, with the NFR method, a grammar as previously defined, by 

producing with the NFR method monomer sequences that unambiguously 

represent said set of rules R of a grammar G; 
assembling, from said monomer sequences manufactured as defined, for each 

rule of said set of rules R of G an oligomer representing that rule; 
linking said oligomers assembled in Step IV to information-carrying polymers 
coding words of said regular grammar G. 

-3- 
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Please replace claim 10 with the following amended version: 

10. (Amended) A random number generator comprising information-carrying 
polymers. 

Please replace claim 1 1 with the following amended version: 

1 1 . (Amended) The information-carrying polymers of claim 9, wherein the 
polymers comprise part of a polymeric data storage. 

Please replace claim 12 with the following amended version: v 

12. (Amended) The information-carrying polymers-of claim 9, wherein the 
polymers comprise part of a DNA computer. 

Please replace claim 13 with the following amended version: 

13. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers comprise part of a biochip. 

Please replace claim 14 with the following amended version: 

14. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used to manufacture molecular weight standards. 

Please replace claim 15 with the following amended version: 

15. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used to represent data structures. 

Please replace claim 16 with the following amended version: 

16. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used as markers or signatures. 

Please replace claim 17 with the following amended version: 
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17. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of quality assurance. 

Please replace claim 18 with the following amended version: 

1 8. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of forgery protection. 

Please replace claim 19 with the following amended version: 

19. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling genetically engineered products. 

Please replace claim 20 with the following amended version: 

20. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling food. 

Please replace claim 21 with the following amended version: 

21 . (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling organisms. 

Please replace claim 22 with the following amended version: 

22. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling chemical products. 

Please replace claim 23 with the following amended, version: 

23. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling medical and pharmaceutical products. 

Please replace claim 24 with the following amended version: 

24. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling documents. 
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Please replace claim 25 with the following amended version: 

25. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling money. 

Please replace claim 26 with the following amended version: 

26. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling objects and machinery. 

Please replace claim 27 with the following amended version: 

27. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of labeling liquids, solutions, suspensions, or emulsions. 

Please replace claim 28 with the following amended version: 

28. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used to encrypt information. 

Please replace claim 29 with the following amended version: 

29. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for the purpose of authenticating persons and objects. 

Please replace claim 30 with the following amended version: 

30. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used as molecular-scale adhesives. 

Please replace claim 3 1 with the following amended version: 

3 1 . (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used to manufacture or process molecular structures. 

Please replace claim 32 with the following amended version: 
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32. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used for quality control of synthetically produced oligonucleotides. 

Please replace claim 33 with the following amended version: 

33. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used to manufacture biochips. 

Please replace claim 34 with the following amended version: 

34. (Amended) The method according to claim 1 , wherein 1 - to n-byte nucleic 
acids are obtained. 

Please replace claim 35 with the following amended version: 

35. (Amended) The method according to claim 1, wherein, 1- to n-byte 
biochips are obtained. 

Please replace claim 36 with the following amended version: 

36. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers comprise part of biochips, wherein said biochips are used for data storage. 

Please replace claim 37 with the following amended version: 

37. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers comprise part of biochips, wherein said biochips are used to manufacture optical 
display devices or display screens. 

Please replace claim 38 with the following amended version: 

38. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used to label individual molecules. 

Please replace claim 40 with the following amended version: 
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40. (Amended) Use of nucleic acids to encrypt information, wherein short 
nucleic acid sequences are used as a key for decryption. 

Please replace claim 41 with the following amended version: 

41 . (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are used to encrypt information, wherein the information-carrying polymer is 
concealed in a multitude of other polymers. 

Please replace claim 42 with the following amended version: 

42. (Amended) Use of polymers for the purpose of labeling, wherein the 
polymers are encrypted. 

Please cancel claim 43 without prejudice. 

Please replace claim 45 with the following amended version: 
45. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers are read using biochips. 

Please add new claims 48-53 as follows: 

48. (New) The method according to claim 1, wherein said monomer sequence 
constructed in Step DI includes ribonucleotides. - 

- 49. (New) The method according to claim 1, wherein said monomer sequence 
constructed in Step HI includes deoxyribonucleotides. - 

- 50. (New) The method according to claim 4, wherein said protein recognition 
sequences are selected from the group consisting of restriction cut sites, protein binding sites, 
and stop codons. 
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-51. (New) The information-carrying polymer of claim 9 comprising part of a 
random number generator. — 

- 52. (New) The information-carrying polymers of claim 9, wherein the polymers 
are used to label individual nucleic acids. — 

- 53. (New) The information-carrying polymers of claim 9, wherein the polymers 
are used to label genes. — 



REMARKS 

Entrance of this Amendment is respectfully requested. Attached hereto is a marked 
version of the changes made to the specification and claims by the current Amendment. The 
pages captioned "MARKED-UP VERSION OF AMENDMENTS" include marking to show 
the changes made to the claims by the current Amendment. 

Status of Claims (Amendments) 

Applicants have cancelled claim 43, amended claims 1-38, 40-42, and 45, and added 
new claims 48-53. Claims 1-42 and 43-53 are pending in this application with claims 1, 9, 
10, 39, 40, 42, 44, 46, and 47 being the only independent claims. Examination and 
consideration of the claims are respectfully requested. 



Respectfully submitted, 




Reg. No. 46,748 

SHINJYU GLOBAL IP COUNSELORS, LLP 

1233 Twentieth Street, NW, Suite 700 

Washington, D.C. 20036 

(202)-293-0444 / 

Dated: °j I Jo I 

G:\09-Sep-SMM\P#TEP00 0E893 
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MARKED-UP VERSION OF AMENDMENTS 

IN THE CLAIMS : 

Claim 1 was amended as follows: 

1 . (Amended) A method for producing information-carrying polymers, 
comprising: 

Step t defining a regular grammar G = (X, V, R, S) with a finite terminal 
alphabet a finite set of variables V, a finite set of rules R, and a start 
symbol S; 

Step It being the NFR method (Niehaus-Feldkamp-Rauhe method) for 
producing monomer sequences; 

Step IIL implementing, with the NFR method, a grammar as defined in Step 
I, by producing with the NFR method monomer sequences that 
unambiguously represent the said set of rules R of a grammar G; 
Step IVt assembling, from the said monomer sequences manufactured in 
Step m, for each rule of the said set of rules R of G an oligomer 
representing that rule; 

Step Vt linking the said oligomers assembled in Step IV to information- 
carrying polymers coding words of said regular gr ammar G. 

Claim 2 was amended as follows: 

2. (Amended) The method Method according to claim 1 , characterized in that 
the wherein said terminal alphabet X of a grammar G contains terminals 0 and 1, n start 
terminals s (s 0 , Si, . . sj and m end terminals e (eo, e b . . ej, wherein n and m are integers 
larger than or equal to 0. 

Claim 3 was amended as follows: 



- 10- 



Serial No.: PCT/EPOO/02893 
Filed: March 31, 1999 
Page 11 of 19 



3. (Amended) The method Method according to claim 1 wherein said 
characterized in that the monomer sequence constructed in Step m includes nucleotides^ 
particular ribonucleotides, most preferably deoxyribonucleotides . 

Claim 4 was amended as follows: 

4. (Amended) The method Method according to claim 3, wherein said 
characterized in that the monomer sequence constructed in Step m includes protein 
recognition sequences (such as restriction cut sites, protein binding s ites or stop codons) . 

Claim 5 was amended as follows: 

5. (Amended) The method Method according to any of the preceding claims 
claim L wherein said characterized in that the synthesis of the said monomer sequences in 
Step n and m is carried out in vitro , preferably with an oligonucleotid e synthesizer . 

Claim 6 was amended as follows: 

6. (Amended) The method Method according to claim 1, w herein said fef 
isolating and amplifying information carping polymers that have been obtained in 
accordance with any of the preceding claims, characterized in that the information-carrying 
polymers obtained in Step V are ligated into cloning vectors T i 

competent cells are transformed with these vectors T ; and 
the successfully transformed bacteria cells are selected according to selection 
markers. 

Claim 7 was amended as follows: 

7. (Amended) The method Method for reading information from information 
carrying polymers that have bwn nht**\™*A nr fonlnted and amplified in accordance with any 
of t he preced e r1 *"™ r v rfrnmrtflrizod in that accordin g to claim L further comprising 

a) obtaining one pair of anti-sense primers, each of said primers is mixed into at 
least n-1 solutions containing the said information-carrying polymer, 
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wherein n is the number of oligomers contained as elongators in the said 
polymer; 

b) carrying out at least n-1 PCR approaches, wherein n is the number of 

oligomers contained as elongators in the said polymer, and one primer of 
each pair primes in the terminator opposite to the elongator and the other 
primer primes in the elongator itself; 

e) obtaining polymer fragments obtained by PCR. wherein said polymer 
fragments are separated by length using electrophoresis; and 

4) the optically reading out a pattern obtained by electrophoresis is read out 
optically . 

Claim 8 was amended as follows: 

8. (Amended) The method Method according to claim 7, characterized in that 
the wherein said reading out of said pattern in Step d) is performed automatically with a 
scanner or a sequencing machine. 

Claim 9 was amended as follows: 

9. (Amended) Information-carrying polymers polymer obtained in accordance 
with one of the claims 1 to 6 produced by the steps comprising: 

defining a regular grammar G = (T. V, R, S) with a finite terminal al phabet 

a finite set of variables V. a finite set of rules R, and a start symbol S; 
being the NFR method (Niehaus-Feldkamp-Rauhe method) for producing 

monomer sequences; 
implementing, with the NFR method, a grammar as previously defined, by 

producing with the NFR method monomer sequences that unambiguously 

represent the said set of rules R of a grammar G; 
assembling, from said monomer sequences manufactured as defined, for each 

rule of said set of rules R of G an oligomer representing that rule; 
linking said oligomers assembled in Step IV to information-carrying polymers 
coding words of said regular grammar G . 
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Claim 10 was amended as follows: 

10. (Amended) A random Random number generator comprising information- 
carrying polymers , in particular polymers in accordance with claim 9 . 

Claim 1 1 was amended as follows: 

1 1 . (Amended) Polymeric data storage comprising The inform ation-carrying 
polymers in accordance with of claim 9 . wherein the polymer s comprise part of a polymeric 
data storage . 

Claim 12 was amended as follows: 

12. (Amended) DNA computer comprising The information-carrying polymers 
in accordance with of claim 9 . wherein the polymers comprise part of a DNA computer. 

Claim 13 was amended as follows: 

13. (Amended) Riochip comprising The information-carrying polymers in 
accordance with of claim 9 , wherein the polymers comprise part of a biochip. 

Claim 14 was amended as follows: 

14. (Amended) Use of The information-carrying polymers in accordance with 
of claim 9 , wherein the polymers are used to manufacture molecular weight standards. 

Claim 15 was amended as follows: 

1 5 . (Amended) Use of The information-carrying polymers in accordance with 
of claim 9 . wherein the polymers are used to represent data structures. 

Claim 16 was amended as follows: 

16. (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9 . wherein the polymers are used as markers or 
signatures. 
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Claim 17 was amended as follows: 

1 7 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
quality assurance. 

Claim 18 was amended as follows: 

1 8 . (Amended) Use of The information-carrying polymer s, in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
forgery protection. 

Claim 19 was amended as follows: 

1 9 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling genetically engineered products. 

Claim 20 was amended as follows: 

20. (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling food. 

Claim 21 was amended as follows: 

2 1 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling organisms. 

Claim 22 was amended as follows: 

22. (Amended) U s e of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling chemical products. 
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Claim 23 was amended as follows: 

23 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling medical and pharmaceutical products. 

Claim 24 was amended as follows: 

24. (Amended) Use-of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling documents. 

Claim 25 was amended as follows: 

25 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling money. 

Claim 26 was amended as follows: 

26 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling objects and machinery. 

Claim 27 was amended as follows: 

27 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
labeling liquids, solutions, suspensions^ or emulsions. 

Claim 28 was amended as follows: 

28 . (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used to encrypt 
information. 
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Claim 29 was amended as follows: 

29. (Amended) Use of The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the polymers are used for the purpose of 
authenticating persons and objects. 

Claim 30 was amended as follows: 

30. (Amended) Use-ef The information-carrying polymers , in particular of 
polymers in accordance with of claim 9, wherein the p olvmers are used as molecular-scale 
adhesives. 

Claim 31 was amended as follows: 

3 1 . (Amended) Use of The information-carrying polymers in accordance with 
of claim 9 . wherein the polvmers are used to manufacture or process smallest molecular 
structures. 

Claim 32 was amended as follows: 

32. (Amended) Use of The information-carrying polymers in accordance with 
of claim 9 . wherein th e polvmers are used for quality control of synthetically produced 
oligonucleotides. 

Claim 33 was amended as follows: 

33 . (Amended) Use of The information-carrying polymers in accordance with 
of claim 9 . wherein the polvmers are used to manufacture biochips. 

Claim 34 was amended as follows: 

34. (Amended) 1 to n byte nucleic acids, in particular The method according 
to claim L wherein 1- to n-byte nucleic acids are obtained by any of the claims 1 to 6 . 

Claim 35 was amended as follows: 
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35 . (Amended) 1 to n byte biochips, in particular The metho d according to 
claim 1, wherein, 1- to n-byte biochips are obtained by any of the claims 1 to 6 . 

Claim 36 was amended as follows: 

36. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers comprise part of biochips. wherein said biochips are used for Use of biochips, in 
particular of biochips in accordance with any of the claims 13, 33 and 35, as a data storage. 

Claim 37 was amended as follows: 

37. (Amended) The information-carrying polymers of claim 9, wherein the 
polymers comprise part of biochips. wherein said biochips are used Use of biochips, in 
particular of biochips in accordance with any of the claims 13, 33 and 35, to manufacture 
optical display devices or display screens. 

Claim 38 was amended as follows: 



molecules , in particular nucleic acids, most preferably of genes . 
Claim 40 was amended as follows: 

40. (Amended) Use of nucleic acids to encrypt information, wherein 
characterized in that for decryption, short nucleic acid sequences (primers) are used as the a 
key for decryption . 

Claim 41 was amended as follows: 



41 . (Amended) Use of The information-carrying polymers , in particular 

polymers in accordance with of claim 9, wherein the polymers are used to encrypt 
information, characterized in that wherein the information-carrying polymer is concealed in a 
multitude of other polymers. 
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Claim 42 was amended as follows: 

42. (Amended) Use of polymers for the purpose of labeling, characterized in 
feat wherein the polymers are encrypted. 

Claim 43 was cancelled without prejudice. 

Claim 45 was amended as follows: 

45 . (Amended) Method for reading information from The information-carrying 
polymers of claim 9, wherein the polymers are read using obtained or isolated and amplified 
in accordance with any of the claims 1 to 6, characterized in that biochips are used for the 
reading . 

New claims 48-53 were added as follows: 

— 48. (New) The method according to claim 1, wherein said monomer sequence 
constructed in Step EI includes ribonucleotides. - 

49. (New) The method according to claim 1, wherein said monomer sequence 
constructed in Step HI includes deoxyribonucleotides. — 

~ 50. (New) The method according to claim 4, wherein said protein recognition 
sequences are selected from the group consisting of restriction cut sites, protein binding sites, 
and stop codons. — 

— 51. (New) The information-carrying polymer of claim 9 comprising part of a 
random number generator. — 

— 52. (New) The information-carrying polymers of claim 9, wherein the polymers 
are used to label individual nucleic acids. — 
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- 53. (New) The information-carrying polymers of claim 9, wherein the polymers 
are used to label genes. — 
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INFORMATION-CARRYING AND 
INFORMATION-PROCESSING POLYMERS 

DESCRIPTION 

The present invention relates to methods for producing information-carrying 
5 polymers and to information-carrying polymers obtained with the following methods: to 
methods for isolation, amplification and selection of such information-carrying 
polymers; and to polymeric data storages and DNA-computers, which contain 
information-carrying polymers, as well as the use of information-carrying polymers for 
the production of molecular weight standards, as markers or signatures, for the 
10 encryption of information, for molecular-scale adhesives, or for the production or 
processing of miniature molecular structures. 

It is well-known to use nucleic acids to mark materials. U.S. Patent No. 
5,451,505 describes marking materials with nucleic acids of a length of 20 to lOOObp 
that may serve the purpose of authentication. However, some basic problems, such as 
15 the construction of appropriate sequences or the appropriate coding for the 
representation of information are not discussed, or even satisfactorily solved. 

It is well-known that nucleic acids such as DNA (deoxyribonucleic acid) and 
RNA (ribonucleic acid) can be used outside of living organisms to process information. 
Several different approaches for using DNA molecules to process information have 

20 been proposed. 

WO 97/07440 and [L.M. Adelman, Molecular Computation of Solutions to 
Combinatorial Problems, Science, 266, 1021-1024, (1994)] describe a method for 
calculating a graph algorithm ("Hamiltonian path problem") as a decision problem 
using DNA molecules. The algorithm is implemented by representing the edges and 

25 nodes of the graph by DNA sequences. The calculation of the algorithm is performed 
by generating a set of paths of the graph by hybridizing complementary DNA 
sequences. 

From this set of paths, all paths that are a false solution or no solution of the 
algorithm are removed by biomolecular methods. Once all steps have been executed 
30 successfully, either at least one DNA strain containing the correct solution of the 




problem is left in the end, or no DNA strain is left, which means that there is no solution 
for the algorithm. 

The above-described method presupposes that enough molecules are available 
that every possible solution of the respective problem instance occurs statistically at 
5 least once in the original set of paths. However, this cannot be guaranteed due to the 
statistical nature of the hybridization operation. Furthermore, it is possible that 
cyclical graphs are created in the first step, which increases the number of false 
solutions. 

The approach described in WO 97/07440 shows that symbol processing in vitro 
10 with DNA molecules is in principle possible. However, the described approach can 
hardly be used in practice: One problem is that the algorithm is not deterministic but 
only stochastic, and the obtained result is only at a certain probability correct. A 
further problem is that the implementation of the algorithm is not efficient (run-time 
optimized), so that the calculation is slow. 
15 It is explained that the method used is particularly suitable for the solution of 

NP-complete problems (a class of problems for which only deterministic solution 
algorithms with exponential run-time are known, and for which it is assumed that 
efficient deterministic solution algorithms do not exist), because the algorithm 
necessitates only a linear run-time due to the parallelism of the executed operations. 
20 However, this argumentation is erroneous, because the linear run-time is compensated 
by an exponential number of molecules. Thus, the exponential nature of the problem 
remains unchanged. 

All in all, the method described in WO 97/07440 is too limited to be useful for 
molecular information processing beyond the described algorithm: The Hamiltonian 
25 path problem is permanently coded, other algorithms cannot be calculated, and every 
problem instance has to be coded anew. Programming is not possible, and all steps of 
the algorithm are executed manually. There is no input/output system. All in all, the 
method is not suitable for programming algorithms or for implementing a computer. 

WO 97/29117 and [Frank Guarnieri, Makiko Hiss, Carter Bancroft, Making 
30 DNA Add, Science, 273, 220-223 (1996)] describe a method for performing additions 
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with DNA molecules. The addition is carried out as a shift operation with overflow 
carry in vitro with DNA molecules and primer elongation. 

The described method describes additions only. However, even the use as an 
adder is unfavorable for at least two reasons: Firstly, the addition with the described 
5 method is not implemented efficiently (run-time optimized). Moreover, the used 
system is formally incomplete, because the results obtained by addition cannot be used 
as numbers for further calculations. Concepts that go beyond the addition are lacking. 
All in all, the method is not suitable for programming algorithms or for implementing a 
computer. 

10 [Qi Ouyang, Peter D. Kaplan, Shumao Liu, Albert Libchaber, DNA Solution of 

the Maximal Clique Problem, Science, 278, 446-449, (1997)] describes a method for 
implementing an algorithm for solving the "max-clique problem." The algorithm is 
derived from graph theory and belongs to the NP-complete problems. Like [L.M. 
Adelman, Molecular Computation of Solutions to Combinatorial Problems, Science, 

15 266, 1021-1024, (1994)], the method can calculate only permanently coded solution 
algorithms. Also in this case, the result obtained is only a certain probability. The 
method described by the authors contains no continuing concepts (for example 
regarding the implementation of other algorithms). All in all, the method is not 
suitable for programming algorithms or for implementing a computer. 

20 [Eric Winfree, Xiaoping Yang, Nadrian C. Seeman, Universal Computation via 

Self-assembly of DNA: Some Theory and Experiments, Proceedings of the 2nd 
DIMACS Meeting on DNA Based Computers, Princeton University, June 20-12, (1996)] 
discusses the implementation of regular grammars in DNA by linking oligonucleotides 
with "sticky ends." However, the discussion is purely theoretical and fails because of 

25 substantial, hitherto unsolved problems. In particular, it is not shown how the 
sequences necessary to implement grammars have to be constructed. Yet this is the 
crucial problem that has to be solved in order to implement grammars. The reason for 
this is that sequences that represent the variables and terminals of a grammar have to be 
unambiguous as well as sufficiently dissimilar to one another. Moreover, they have to 

30 fulfill certain structural, chemical and physical conditions. Otherwise, failed 



hybridizations among sequences are inevitable, which lead to undesired chain 
extensions and chain break-off during polymerization, thus making proper functioning 
of the method impossible. This problem is aggravated exponentially with the number 
of necessary sequences. 
5 The described method is rejected by the authors themselves, or referred to as 

insufficient insofar as it does not allow very interesting calculations to be done (see Erie 
Winfree, Xiaoping Yang, Nadrian C. Seeman, Universal Computation via Self-assembly 
of DNA: Some Theory and Experiments, Proceedings of the 2nd DIMACS Meeting on 
DNA Based Computers, Princeton University, June 20-12, (1996), page 8, second 
10 paragraph, line 1). 

Further experiments by the authors are therefore not based on regular 
grammars and linear polymers, but on context-sensitive grammars for the generation of 
DNA lattice structures [Eric, Winfree, Furong Liu, Lisa A. Wenzler & Nadrian C. 
Seeman, Design and self-assembly of two-dimensional DNA crystals, Nature, 394, 
15 539-544, (1998)]. The approach described by the authors is possibly useful for the 
generation of DNA based nano-structures (fabrication of catalysts, etc.). However, for 
information processing, the approach is not suited, as the generated lattice structures 
tend to be troublesome. v 

The authors raise the possibility of realizing the mathematical concept of 
20 "Wang tiles" with the generated lattice structures and possibly using it for calculations, 
but they merely mention this possibility, without describing how it could be used in the 
context of information processing. 

It is doubtful whether the approach described by the authors is suitable at all 
for information processing: The lattice structures prevent the amplification of the 
25 information-carrying sequences (for example by cloning or PCR = polymerase chain 
reaction), and make it impossible to use the created structures as data, since they cannot 
be read out anymore, for example by PCR. Consequently, the selected approach in its 
current form presents conceptually no possibility to use the created structures in the 
context of information processing, for example as data. 
30 From the conventional technology, no method is known that makes it possible 




to use DNA molecules for the implementation of efficient algorithms. No method is 
known to execute calculations automatically (for example in form of a molecular 
computer). Furthermore, no method is known for implementing programs in the form 
of regular grammars with molecular methods, such that 
5 a) different (ideally any desired) regular grammars can be implemented; 

b) words of a language that have been generated by regular grammars can be read 
out; 

c) the words of a language that have been generated by regular grammars can be 
further used for technical purposes (such as information processing, polymer 

10 chemistry). 

The problem solved by the present invention is to present a method with which 
it is possible to carry out information processing on molecular basis with regular 
grammars, without being limited to one or a few grammars. The method should be 
compatible to conventional computers and allow information processing that is partly 
15 molecular-based and partly based on conventional computers. The method should be 
programmable and it should be possible to largely automate the method. The polymers 
generated in the method should be readable and usable for further applications and 
methods. 

This problem is solved by a method for manufacturing information-carrying 
20 polymers, including: 

I. defining a regular grammar G = (X, V, R, S) with a finite terminal alphabet X> a 

finite set of variables V, a finite set of rules R, and a start symbol S; 

n. the NFR method (Niehaus-Feldkamp-Rauhe method) for producing monomer 

sequences (oligomers or polymers); 
25 m. implementing, with the NFR method, a grammar defined in Step I, by 

producing with the NFR method monomer sequences that unambiguously represent the 

set of rules R of a grammar G; 

IV. assembling, from the monomer sequences produced in Step m, for each rule of 
the set of rules R of G an oligomer (algomer) representing that rule (algomer assembly); 
30 V. linking the oligomers (algomers) assembled in Step IV to information-carrying 



polymers (symbol polymerization). 

In the framework of the present invention, the following definitions and 

abbreviations are used: 

algomer double-stranded oligomer that represents a rule of a given grammar. 

Algomers can be linked with one another to logomers. 
readout PCR PCR that is used to read out the information contained in the 

logomers. 

Biochip carrier of a number of nucleic acids that are used to detect 

complementary sequences; also referred to as microarray, DNA array, 
gene array, or gene chip. 

bit polymerization process of concatenating algomers together into logomers, 

when there are only two different elongators. 

byte information unit consisting of 8 bits, or molecule representing 8 bits. 

elongator algomer with two overhang sequences; can ligate with a terminator or 

an elongator and leads to chain elongation. 

grammar formalism describing languages. The formalism is based on a formal 

theory of languages [Chomsky, N., Three models for the description of 
language, /ACM, 2:3, 113-124, (1956)], [Chomsky, N., On certain 
formal properties of grammars, Inf. and Control, 2:2, 137-167, (1959)], 
[Chomsky, N., Formal properties of grammars, Handbook of Math. 
Psych., 2, 323-418,(1963)] 

A grammar G describes a language L(G), the alphabet of this language 
and its syntax. With a grammar G, all words of that language can be 
generated. 

A grammar G is a quadruple (£, V, R, S) with a terminal alphabet £, a 
set of variables V, a start symbol S and a set of rules R. 
logomer polymer that carries symbolic information and that has been generated 

by the linking of algomers. Correspondingly, a logomer is made of 
repeating units of algomers. A logomer represents a word of a 
language L(G), that is generated by a corresponding grammar G. 
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monomer single molecule. A plurality of monomers can be linked to long 

chains, and thus form oligomers and polymers. 

In the case of deoxyribonucleic acid, the nucleotides (adenine, 

cytosine, guanine, thymine, as well as analogous bases such as 

hypoxan thine, etc.) are the monomers 
multibyte an arbitrary data structure that is made of a plurality of bytes, 

oligomer short-chain molecule made of repeating units (monomers). Also, 

short double-stranded molecules are referred to as oligomers. 
PCR polymerase chain reaction; method for the exponential amplification 

of DNA. 

PCR needs a DNA template to be amplified and two primers that start 
in opposite directions in the template and serve as the starting points 
of a DNA polymerization. By iterative repeating of 
melt-hybridization-polymerization cycles, the DNA template is 
multiplied. 

polymer long-chain molecule made of repeating units (monomers), 

rule also: production rule, substitution rule, or derivation rule. 

Describes the substitution of symbols by other symbols. Symbol 

chains can be generated by repeated application of rules, 
sequence Information theory: Series of symbols. 

Chemistry: Series of covalently bonded monomers. 

Molecular biology: Series of covalently bonded nucleotides, 
complementarity two sequences are complementary when they can hybridize with one 

another. For example, in the case of DNA, the sequences 5 ' att3 ' 

and 5 ' aaat3 ' are complementary, and the sequence 5 ' acgt3 ' is 

complementary to itself, 
start symbol variable within a grammar, starting from which a symbol chain can be 

generated by application of the rules of the grammar, 
symbol polymerization the process of concatenating algomers into logomers. 
terminal symbol of a grammar. A terminal cannot be substituted any further. 
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Terminals are the "letters" of the words of a language L(G). 
algomer with one overhang sequence. Can ligate only with an 
elongator. Leads to chain break-off in a symbol polymerization, 
unambiguity of sequences among one another. A sequence S of 
monomers is 10-unique to a set M of other sequences, if no partial 
sequence of S of a length 10 appears in any other sequence of the set 
M. The uniqueness can also be given in percent. For example, two 
sequences of length 20, whose longest common partial sequence is 
five monomers, are 6-unique to one another, and their uniqueness in 
percent is (1 - 5/20)* 100 = 80%. 

symbol of a grammar. According to a rule of a grammar, a variable 
can be substituted by terminals, variables or combinations of terminals 
and variables. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 is a view of algomers as described in the explanation of Step IV of the 
method of the present invention. The algomers each have a double-stranded core 
sequence, that represents a terminal of a given grammar and at least one single-stranded 
overhang sequence, which represents a variable of the given grammar, so that an 
algomer represents exactly one rule of the grammar. X and Y are not variables, but 
overhang sequences, that can be used for cloning, for example. The letters marked 
with bars represent complementarity. According to the definition, A0A and A1A are 
elongators, XsA and AeY are terminators. The algomers shown here represent the 
grammar for the generation of binary random numbers described below. 

Fig. 2 is a view of symbol polymerization as described in the explanation of 
Step V of the present invention. Algomers are linked into logomers by concatenation 
(in the case of DNA: Hybridization and ligation). The logomer XsOlOlOlOleY 
contains a terminated bit series that can be amplified by the overhangs X and Y in a 
definite orientation. 

Fig. 3 is a view of a pattern illustrating the symbol polymerization of binary 
random numbers after gel electrophoresis and coloring (tracks 1 to 3). Because of the 



terminator 



uniqueness 
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random length of the binary random numbers, the ladder pattern is regular. Track 4 
shows a 50bp molecular weight standard. (Gibco BRL, Life Technologies, Catalogue 
No. 10416-014). 

Fig. 4 is a view of cloning of a logomer obtained by symbol polymerization in 
5 a vector, as described in the following for the amplification and isolation of logomers. 

Fig. 5 is a view of a diagram schematically illustrating a band pattern that is 
obtained by reading out a binary logomer by PCR and subsequent gel electrophoresis 
(described below). In the illustrated example, the algomers have a length of 30bp 
(taking overhang sequences as 1/2). For logomers with known length, it is sufficient to 
10 read only the O's or l's. Reading both kinds of bits is good for control. The method 
can also be used for polyvalent logomers (with more than two elongators). 

Fig. 6 is a view of a band pattern of a gel electrophoresis after PCR to read out 
three different logomers. The logomers have been obtained in accordance with the 
grammar for random numbers of any length, which is described below. The random 
15 numbers, read from bottom to top, are: a = 262, b = 97, c = 329. M (track 5) is a 50bp 
molecular weight standard (Gibco BRL, Life Technologies, Catalogue No. 10416-014). 

Fig. 7 is a view of a diagram schematically illustrating the reading out of 
logomers by restriction digestion as described below. The elongators carry restriction 
cut sites, that are arranged asymmetrically, so that the restriction digestion of logomers 
20 results in an unambiguous cutting pattern of fragments. The cutting pattern 
corresponds to bands of various lengths, and can be made visible by gel electrophoresis. 
Rl and R2 are different restriction enzymes, and x and y denote the length of the 
fragments obtained after the restriction digestion. An x:y ratio of 1 :2 is suitable. 

Fig. 8 is a view of band patterns that are obtained by gel electrophoresis after 
25 the reading out of logomers by restriction digestion. Tracks 7 and 8 contain molecular 
weight standards (track 7: 50bp ladder, Gibco BRL, Life Technologies, Catalogue No. 
10416-014; track 8: lObp ladder). 

Fig. 9 is a view of a schematic representation of a polymeric data storage based 
on algomers and symbol polymerization as described below. Algomers can 
30 polymerize at anchor molecules (AM) bonded to a solid carrier (C). Writing (W) is 



10 



carried out by a repeating (ReW) of hybridization-ligation-restriction cycles (Hyb, Lig, 
Res). The obtained logomers can be separated and read out by denaturing or 
restriction (Den/Dig). 

Fig. 10 is a view of a simplified diagram showing the components of a DNA 
5 desktop computer as described below. The components are: 

A: oligo-synthesizer, B: thermal cycler, C: reaction chambers, D: pipetting device, 
E: Gel, F: scanner, G: control computer 
The numerals denote: 

1: addition of oligonucleotides, 2: addition of synthesized oligomers in reaction 
10 chambers of the thermal cycler, 3: addition of solutions and molecules needed for 
algomer assembly, symbol polymerization, for isolating logomers and for reading out. 

Fig. 11 is a view of an encryption of logomers as described below. If the 
terminators and the primers priming therein are unknown, then the corresponding 
logomer is encrypted and cannot be read out (A). If, on the other hand, a primer 
15 priming in a terminator is available or the sequence of a terminator is known, then the 
corresponding logomer can be read out (B). 

Fig. 12 is a view of a Y-shaped molecule that can be used as a terminator for 
the asymmetric encryption of logomers. Elongators can be linked to the end marked 
with the 2, and further for example Y-shaped molecules to the ends marked 1 and 3. 
20 Linking several Y-shaped molecules with one another, tree-like structures are obtained. 

Fig. 13 is a view of logomer with the terminators s and e that are realized as 
tree-like structures. 

Fig. 14 is a view of a logomer V with tree-like terminators that is linked with a 
vector V. The terminators are configured such that only one branch each of the 
25 terminator can be linked with the vector. 

Fig. 15 is a view of marking of nucleic acids with logomers as described 
below: In order to mark genetically engineered or modified products, a logomer is 
applied using recombinative techniques in a not-transcribed area, e.g. before the 
promoter (P) of a gene (G) to be marked. 
30 Fig. 16 is a view of marking of documents with logomers as described below. 
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Track 1 shows the used molecular standard (50bp, Gibco ZBRL, Life Technologies, 
Catalogue No. 10416-014). Track 2 (0 bits) and track 3 (1 bits) show the band pattern 
of the readout of logomer No. 330 by PCR from an aqueous solution. Tracks 4 and 5 
are the same as tracks 2 and 3, but here the logomer No. 330 is not read from an 
5 aqueous solution but was introduced to the readout PCR in form of paper scraps (of 
about 1mm 2 ) of 10 9 molecules/jil dried on paper (3M Postlt, dried for 1 hour), 

Fig. 17 is a view of production of molecular weight standards by readout PCR, 
containing a unary logomer as a template, as described below. 1 shows the 
arrangement of nested primers in the readout PCR, 2 shows the band pattern obtained 

10 after gel electrophoresis of the PCR fragments. 

Fig. 18 is a view of gluing of surfaces with nucleic acids as described below. 
C denotes the surface to be glued, Log denotes the logomers that are bonded to the 
surfaces for gluing. 1 shows the behavior of areas with non-complementary logomers 
and 2 shows the behavior of areas with complementary logomers. 

15 Fig. 19 is a view of gluing of surfaces with logomers, antibodies and ligands as 

described below: Co and Ci denote the surfaces to be glued, which can be of the same 
or of different materials. Logomers (Log) are applied on the surfaces to be glued. 
Proteins, such as antibodies of the type Ab A and Ab B can bond to the logomers, 
wherein Ab A and Ab B can be the same or different. Ab A and Ab B are bonded to 

20 one another by a ligand (Li). 

Fig. 20 is a view of gluing of surfaces with proteins, such as antibodies, 
without using logomers. Co and Ci denote the surfaces to be glued, which can be of 
the same or of different materials. Proteins of the type Ab A and Ab B bond directly 
to the surfaces to be glued, and are bonded to one another by a ligand (Li). 

25 Fig. 21 is a view showing that when linking sequences to longer sequence 

chains, there may be violations of the required uniqueness, which may lead to failed 
hybridizations and thus to disfunctionality. This is caused by the generation of new 
base sequences (marked in the diagram) due to linking. 

Fig. 22 is a view of an example of the linking of terminals with a variable. 

30 The four illustrated rules of a grammar with a variable A result in four different paths, 
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that overlap over the length of the variable sequence A. Furthermore, some of the 
sequences for multiple terminals (b, c) overlap, so that three paths converge on the left 
and on the right. 

Fig. 23 is a view of showing that if more than four different variable sequences 
5 (A, B, C, D, E) have transitions to the same terminal sequence (0), then at least one base 
sequence has to be used several times. The dotted frame indicates the iteration step, at 
which a violation of the uniqueness has to be tolerated, in order to be able to translate 
all rules ofR. 

Fig. 24 is a view showing parallel filling in of two sequence elongations for the 
10 variables A and B. The dotted frame indicates the iteration step, at which the starting 
base sequences for the path search are located. From the next iteration step onward, 
violations of the uniqueness can be tolerated, if necessary. Note that this results in 
branching beyond group borders (e.g. of the terminal b to the variables A and B). 

Fig. 25 is a view of a diagram of a molecule representing 1-byte. The section 
15 marked by x contains an unambiguous sequence that represents a predetermined byte 
value (the 256 nucteic acid sequences needed for the representation of all byte values 
are listed in the sequence listing). The section marked s contains an unambiguous 
sequence, that codes the byte position of the corresponding byte within multibytes and 
serves as a template for PGR reactions (see sequence listing). The sections marked o 
20 and e are for the production of multibytes from single bytes and are used as templates 
for PCR reactions (see sequence listing). 

Fig. 26 is a view of a schematic representation of four 1-byte molecules with 
different byte positions (so - S3). The single bytes can be connected to multibytes. 

Fig. 27 is a view showing that for amplification, individual byte molecules can 
25 be cloned in genetic vectors (plasmids). 

Fig. 28 is a view showing an example of the construction of a byte-representing 
DNA molecule. The molecule includes several functional sub-units: For X, there are 
256 unambiguous base sequences, which represent all values of a single byte. S is an 
unambiguous sequence that represents the position of a byte (namely, the subsequent X 
30 in the strand) within multibytes. O and E are used as unambiguous recognition 
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sequences for the concatenation of single bytes to multibytes. 

DNA bytes can be used for labeling can be non-concatenated or concatenated. Cutting 
out X or SX sub-units yields DNA sections that are used for the "spotting" of biochips. 
The resulting biochips are used for the reading out of single bytes or of multibytes. 

5 Fig. 29 is a view showing concatenation of bytes to multibytes. In this 

example, four bytes are linked to a 32-bit data structure. Moreover, the sequences L 
and R at the ends can function as adaptors, in order to introduce the 32-bit DNA data 
structure into another DNA. They can carry either specific recombination sites or 
restriction sites. An example is the labeling of a plasmid in Fig. 3 1 . 

10 Fig. 30 is a view of a diagram showing the configuration of a 32-bit molecule, 

that has been produced by the linking of four 1-byte molecules. The molecule can be 
admixed or tacked for labeling to substances to be marked. For the labeling of nucleic 
acid constructs and genes, the sequences denoted L and R can carry restriction sites or 
recombination sites, with which they are connected to the molecule to be marked. 

15 Fig. 31 is a view showing labeling of a plasmid with a 32-bit molecule. The 

0 th byte has the value 109, the first byte has the value 67, the second byte has the value 
35, and the third byte has the value 192. As an unsigned 32-bit, the byte patterns of 
the marked plasmid corresponds to the number 3223536493. 

Fig. 32 is a view of a diagram of a 1-byte biochip, that has been spotted with all 

20 256 x-fragments (see Fig. 25 to Fig. 28) from x 0 to x 255 (X-chip). Only some of the 
256 different sequences are shown. To read out multibytes, the corresponding single 
bytes are preamplified by PCR (s to e) and hybridized individually on separate chips. 
Thus, four PCR reactions and four chips are necessary for a 4-byte molecule (see Fig. 
30). 

25 Fig. 33 is a view of a diagram of a 1-byte biochip, that has been spotted with all 

256 x-fragments (see Fig. 25 to Fig. 28) from s 0 x 0 to s 0 x 25 5 (X-chip). Only some of the 
256 different sequences are shown. In contrast to the chip type of Fig. 32, the 
hybridization conditions can be selected such that bytes can be detected in dependence 
of their position in a multibyte. In this example, only s 0 Xi, but no s n Xi with n # 0 are 

30 detected. Similarly, it is possible to manufacture 1-byte chips that detect only sixi, etc. 




This makes it possible to manufacture chips, that can read out multibytes directly 
(without PCR) and completely. An example of this is the 4-byte chip shown in Fig. 
37. 

Fig. 34 is a view of a layout of a 1-byte chip. If the chip is implemented as an 
5 X-chip (see Fig. 32), then it contains 256 spots with all 256 x-fragments (see sequence 
listing). If it is implemented as an SX-chip (see Fig. 33), then it contains 256 spots 
with all SiX-fragments. Each x-fragment represents exactly one byte value (xo = 0, xi = 
1\, X255 = 255). The sequences for s and x are selected such that they have melting 
temperatures that are as similar to one another as possible, but sequences that are as 

10 different from one another as possible, in order to preclude failed hybridizations. 

Fig. 35 is a view showing manufacturing of multibyte chips from 1-byte 
SX-chips. This example shows the manufacturing of a 4-byte chip of one SXo chip, 
one SXi chip, one SX2 chip, and one SX3 chip. Multibyte chips can be arranged to (a) 
linear or (b) two-dimensional byte arrays. 

15 Fig; 36 is a view of a read out a 32-bit molecule- (see Fig. 30 and Fig. 31) with 

a 4-byte SX-chip (see Fig. 35). For the readout, 32-bit molecules can be hybridized 
directly with the entire chip. Thus, the 32-bit value can be read out directly (marked in 
the diagram). Moreover, bytes can be amplified independently from one another and 
hybridized separately. Alternatively, a multibyte-chip can be made of identical 8-bit 

20 units, that are spotted only with X-fragments. In order to maintain the position 
information of the bytes, the individual bytes must be spotted separated from one 
another in the corresponding sectors. 

Fig. 37 is a view of a multibyte array of identical 1-byte chips (X-chips). 
Multibyte arrays of X-chips can also be used to read out marking molecules as 

25 described above. In addition, multibyte arrays can be used for storage and for the 
optical display of computer data. 

The Sequence Listing shows molecules that are needed for the production of 
the 32-bit molecules described below. The molecules are assembled to algomers, as 
described below. The o, s, e, and x units are assembled to bytes, as described below. 

30 These are, in turn, assembled to multibytes (in this example to 4-byte 




= 32-bit molecules). The 32-bit molecules are then used for labeling and marking and 
are also used for the manufacturing of biochips (as described below). 
Explanation of I: Selection of Grammars 

A grammar is a quadruple G = (I, V, R, S) with a finite terminal alphabet £, a 
5 finite set of variables V, a finite set of rules R and a start symbol S. All words of a 
language L(G) that is described by G can be formed as words of the terminal alphabet 
by derivation from the start symbol S according to the rules R. 

The definition of a grammar G according to Step I of the method of the present 
invention is free in that any finite set of terminals and variables can be coded. 
10 However, preferable for the present invention is the definition of grammars that allows 
the binary representation of data, in particular of grammars with 
X := {0, 1, so, si, s 2 , Sn-i, s„, eo, ei, e 2 , e m -i, e m } with n, m e IN, 
and n, m ^ 0. 

This allows the manufacturing of binary logomers. 
15 For illustration, the following illustrates a regular grammar, defined in 

accordance with the present invention, for generating random numbers of arbitrary 
finite lengths in binary notation: 

The grammar G = (Z, V, R, S) has a finite terminal alphabet £ := {0, 1, s, e}, a set of 
variables V := {A}, a start symbol S and a set of rules 

20 R := 
{ 

S := sA 
A OA 
A-» 1A 
25 A* e 

} 

wherein 
s := start 
e := end. 



With this grammar, words of the terminal alphabet 
l:={0,l,s,e} 

can be formed. All words of the language L begin with "s" and end with "e", and in 
between there is a random number (including zero) of random bits ("0" and "1")* 
5 Words of the language L that have been formed according to R are for example: 

a) S ^ sA ^ se 

b) S-»sA-*slA-*sle 

c) S sA sOA sOOA sOOlA sOOlOA sOOlOe 

d) S sA slA slOA slOOA slOOOA slOOOOA slOOOOOA 
10 slOOOOOe 

The binary patterns generated as words of the language L can be read as 
arbitrary but unambiguous representation of data types, e.g. as letters, numbers, 
alphanumeric characters or character strings. Reading them as the binary 
representation of numbers, then the above words are in decimal representation: 
15 a) 0 {empty} 

b) 1 

c) 2 

d) 32. 

Explanation of II: The NFR method for manufacturing monomer sequences 
20 The NFR method (Niehaus-Feldkamp-Rauhe method) is a method for 

manufacturing monomer sequences (oligomers and polymers, e.g. nucleic acids) that 
ensures that the monomer sequences manufactured with this method: 

a) have an unambiguous and unique sequence of monomers; 

b) are as dissimilar to one another as possible, and therefore if possible do not 
25 undergo failed hybridizations; 

c) have certain structural properties, such as a certain ratio among the different 
monomers, the containing or non-containing of certain partial sequences and a 
maximum matching (homology) among one another. 

d) have certain chemical properties, such as the occurrence of certain chemically 
30 active partial sequences that influence chemical reactions, in the case of nucleic 
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acids in particular the interaction with proteins (protein binding locations, 
restriction cut sites, stop codons). 
e) have certain physical properties, such as a certain melting temperature, for 
example. 

5 

The NFR method allows the production of unambiguous monomer sequences 
that are as dissimilar as possible and have predefinable, structural, chemical, and 
physical properties. It is suitable for the production of monomer sequences for 
controlled chemical reactions, such as are necessary for molecular information 

10 processing, for example. It is used in Step m of the inventive method for the 
implementation of regular grammars. 

The NFR method is an inventive improvement of the Niehaus method for the 
generation of unambiguous sequences that are as dissimilar as possible, which is 
described in [Jens Niehaus, DNA Computing: Assessment and Simulation, Master 

15 Thesis Paper at the Faculty of Computer Science of Dortmund University, Department 
XI, 116-123, (1998)], which is incorporated by reference herein. 

The NFR method is an inventive improvement of the Niehaus method in that 
only the NFR method allows the manufacturing of monomer sequences as are necessary 
for the production of monomer sequences in vitro, for example to implement grammars. 

20 The reasons for this are: 

a) The Niehaus method describes only the construction of sequences from base 
sequences of a length of 6. Since the length of basic frequencies describes the 
maximally possible overlaps between different monomer sequences, and thus 
directly influences the hybridization behavior of sequences as well as the 

25 success of the inventive Steps IV and V, the method has to be generalized for 

base sequences of any length, which is achieved by the NFR method. 

b) The Niehaus method allows only the generation of sequences of equal length, 
whereas the NFR method can generate sequences of different lengths and 
different numbers. This last aspect is essential for correct hybridization 

30 behavior of sequences as well as the success of the inventive Steps IV and V. 
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The Niehaus method ensures unambiguity and maximum dissimilarity of 
sequences only for sequences that have been generated by a one-time 
application of the method. However, it is imperative that compatible, that is, 
unambiguous and maximally dissimilar sequences can be generated also for 
existing sequences. In contrast to the Niehaus method, the NFR method can 
generate compatible (that is, unambiguous and maximally dissimilar) new 
sequences for any given sequence. Herein, the method can be applied any 
number of times to the same set of sequences. 

The Niehaus method limits the maximum length of common partial sequences, 
however it does not limit their number. Therefore, two arbitrary sequences 
constructed according to the Niehaus method may contain several common 
partial sequences, which can unintentionally lead to a high homology 
(sequence matching). The homology then has direct influence on the 
hybridization behavior of sequences and the success of the inventive Steps IV 
and V. The NFR method, on the other hand, also carries out a homology 
comparison during the construction of sequences and thus avoids unintentional 
failed hybridizations. 

The Niehaus method does not allow the integration of certain prescribable 
sequences and partial sequences. However, such sequences are absolutely 
necessary for the execution and control of certain chemical reactions, such as 
controlled enzymatic interaction. In the case of nucleic acids, examples of 
such sequences are protein binding locations, restriction cut sites and stop 
codons. On the other hand, the NFR method allows the integration of any 
sequence and partial sequence into the production of monomer sequences and 
simultaneously ensures unambiguity and maximum dissimilarity. 
The Niehaus method does not provide the possibility to generate sequences 
with certain structural, chemical, or physical properties. On the other hand, 
the NFR method includes the possibility of generating sequences with certain 
structural, chemical, and physical properties. This also includes the ratio 
among different monomers and the melting temperature. This is also a 



w 19 

necessary condition for the implementation of grammars in vitro. 
The Niehaus method does not allow a direct generation of rule-representing 
sequences, as are necessary for the implementation of grammars. On the 
other hand, the NFR method allows the generation of symbol-representing 
sequences, which can be linked to rule-representing sequences that are 
necessary for the implementation of grammars. 

Only with the NFR method, sequences can be constructed such that the 
properties of unambiguity and maximum dissimilarity as well as structural, 
chemical, and physical properties are also valid for partial sequences. For 
example, in the case of nucleic acids, monomer sequences can be manufactured 
such that the entire sequence as well as for example the first third of the 
sequence has a 50% content of GC. This property is a necessary condition for 
example for the successful concatenation of sequences that are designated to 
several hybridization events per molecule (such as for example the algomers 
described below). Only in this manner can the sequences be constructed, for 
example, such that the different hybridization events per molecule have equal 
probability. 

Only the NFR method allows the direct, automatic production of monomer 
sequences, for example, with an oligonucleotide synthesizer. 
When linking the sequences together to sequences of longer chains, the 
uniqueness may be violated. For the correct manufacture of polymers by 
linking oligomers (such as the linking of algomers to logomers), it is absolutely 
necessary that the uniqueness violations only occur in a predefined area, so that 
no uncontrolled failed hybridizations occur (see Fig. 21). This problem 
occurs in general and must be solved in particular for the linking of algomers 
and the linking of variables and terminals, because otherwise, a translation of 
grammars into molecules is impossible. The Niehaus method does not solve 
this problem and is therefore not suited for the translation of grammars into 
molecules. The problem is solved by a "parallel extension" strategy, using the 
NFR method. This is explained in the following. 
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Monomer sequences are produced by constructing and synthesizing them with 
the NFR method. To manufacture n monomer sequences, the NFR method is 
carried out as described in the following, using the following abbreviations: 



5 A 



Si 



10 lseq 

Sseq 

Sseq,k 

lbas 



15 



Sba 



20 lov/1: 



ov A seq 



lbas^lseq 

25 M seq 
Mbas 
Mnobas 

IMnobasI 

30 n 



alphabet of the potency a e IN; 

in case of DNA, A := {a, c, g, t} and a = 4. 

sequence. 

series of elements of the alphabet A, which corresponds to a „ 
sequence of monomers. 

length of sequences to be constructed per method cycle, 
sequence of the length l seq . 

(k-lf 1 monomer of the sequence S seq (k e #V,k ^ l seq ). 
length of the base sequences used for the construction of 
sequences per method cycle. 

0 < lbas = lseq- 

sequence of the length l bas = base sequence; sequences of the 
length lseq are constructed from base sequences, 
maximum length of chain of subsequent monomers that are 
shared by two sequences generated by the method ("overlap")- 
ratio of maximally allowed sequence repetition to total length of 
sequence; 

1 st measure for probability of failed hybridization, 
ratio of base sequence length to total length of sequence; 
2 nd measure for probability of failed hybridization, 
set of Sequences. 

set of all base sequences of length lbas- 

subset of the set of all base sequences of length lbas that is 

obtained by decomposition from M seq . 

potency of Mnobas = number of base sequences in Mnobas. 

number of sequences to be constructed per method cycle. 
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n tot ;= total number of sequences produced in all method cycles, 
nmax := maximum number of sequences that can be constructed per 
cycle. 

h := homology. Measure for the matching between two 

5 sequences. 

0 ^ h ^ 1. 

t m := melting temperature. For a DNA sequence, the melting 

temperature is determined according to the nearest neighbor 
method (see Breslauer, K.J., Frank, R:, Blocker, H., Marky, 
10 L.A., Proc. Natl. Acad. Sci., 83, 3746-3750, 1989): 

tm = AH/(AS + R x M(C/4)) - 273.15°C. 
Herein, AH and AS are enthalpy and entropy of the DNA helix, 
R is the molar gas constant and C is the concentration of the 
DNA sequence. 

15 The production method is carried out as a series of any arbitrary but finite 

number of method cycles, in which n tot sequences are constructed, and a subsequent 
synthesis, in which all n t0 t sequences are generated in vitro. 

In accordance with the definitions listed above, the maximum number nmax of 
sequences that can be constructed per method cycle can be determined by 

20 nmax = f(a, lbas, lseq) := L(l/2*((a lbas )- (a lbas/2 )-IM n obasl)) / (WW)-!. 

This means it is possible to construct maximally n max sequences of the length 
l seq from the elements of an alphabet A of size a (in the case of DNA, a = 4), that match 
in maximally l ov contiguous chains of monomers. The number of sequences actually 
obtained per method cycle can be smaller when certain physical, chemical, or structural 

25 properties are desired, or when further sequences are produced in addition to already 
existing sequences. 
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In particular, the following method steps are necessary: 

1 n^ sequences are constructed in z method cycles and collected in a set M seq . 

2 Prior to the method cycle, any unambiguous sequence can be added once to the 




sequence set M seq . This can be useful in order to add certain sequences, which should 
be contained in the set of manufactured monomer sequences because of the further 
application. It is recommended that not too many and not too long sequences are 
added, because the method does not guarantee the same properties for these sequences 
5 as for the sequences constructed in the following. 

3 Start of the method cycle is a view of The structural, chemical, and physical 
properties that the sequences to be produced must fulfill are defined. The structural 
properties are at least containing or non : containing of certain partial sequences (if 
necessary, also the positions, at which the partial sequences in the sequences to be 

10 manufactured should be contained or not contained) and the ratio of the number of 
different monomers to one another (in the case of DNA: ratio GC/AT). The chemical 
properties are at least the occurrence of certain partial sequences that influence 
interaction with their own or other substances (in the case of DNA e.g. certain protein 
binding sites, restriction cut sites such as 5'gatatc3' for EcoRV, the stop codons 5'tca3\ 

15 5'tta3', 5'cta3', etc.). The physical properties to be determined include at least the 
melting temperature t m of the sequences to be manufactured. 

4 The number of n < n^ of sequences desired per cycle is selected in 
accordance with the above formula l seq of the sequences to be constructed is selected; 
and lbas of the base sequences necessary for construction is selected. l seq and l ba s are 

20 chosen in accordance with the above-mentioned formula such that the number n of 
sequences to be constructed can be reached. Since IbaAeq is proportional to the 
probability of failed hybridization, l se q and l bas are chosen such that the ratio IbaAeq is as 
small as possible (values below 0.3 are preferable) and the values for l seq guarantee good 
hybridization conditions. For DNA, any values for l seq > 0 are possible, depending on 

25 the temperature. 

5 A set Mbas of all sequences of the alphabet A of length l bas is generated. The 
sequences generated in this manner are called base sequences. There are always a lbas 
base sequences. The base sequences are used to construct the sequences to be 
generated. Each base sequence is assigned one of the states "used" and "unused." 

30 Initially, all base sequences are marked as unused. 




6 Self-complementary base sequences of M bas are now marked as used. There 
are exactly a lbas/2 base sequences that are self-complementary. 

7 If there are already sequences in the set M seq , then sequences compatible 
thereto can be constructed or the sequences of a subset of M seq can be elongated. For 

5 this, in a decomposition process, a set M n0 bas of all partial sequences with the length l bas 
prescribed in Step 4 is formed of the sequences already available in M seq : 

Set Mnobas = { }- 

Every sequences S seq of M seq is now decomposed in (l seq - lov) 
decomposition steps. 
10 Start with i = 0 with the decomposition step: 

While i < (lseq - lov): 

Form a new sequence S new as a sequence of the 
length lbas, including the monomers S se q,i to S seq,i+lbas • 

Snew Sseq,i >*•*? Sseq,i+lbas - 

15 Add Snew to the set M n0 bas. 

Seti :=i + 1. 

The resulting set M n0 bas is by definition a subset of M bas . 
8 All base sequences of the set M bas that also occur in M n obas are marked as used, 

so that for the construction of sequences exactly (a lbas - a 1 
20 sequences are left over, of which maximally 
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different new sequences can be constructed that have no base sequences or 
25 complements thereof in common with one another and with already present sequences. 
9 From the base sequences, a directed graph is constructed whose nodes 

represents certain base sequences: The graph contains exactly a lbas nodes, of which 
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already (a lbas/2 + IM n0 basl) are marked as nodes. Each node is associated with a base 
sequence S b (i), that is not complementary to itself (in the following, nodes and their 
associated base sequence are not distinguished). 

An edge from S b (i) to S b (k) exists, if the l ov last letters of S b (o correspond to the 
5 lov first letters of S b <k), that is, if: 

S b (i),2 S b (i),i ba s = S b (k),l >■■-> Sb(k)Jbas-l. 

The node S b (k) is called the successor node to S b (i>. 

The node that is associated with the complement of the base sequence that is 
coded by the node S b (o is called the complementary node to S b (i>. 
10 Sequences of length l seq are found by looking for a path with (l seq - lov) nodes. 

No node may be used more than once. Moreover, for each node that is in one of the 
paths, the complementary node may not occur in any path. The starting node of the 
path carries, like the sequence, a state of "marked" or "unmarked." 
10 Sequences are constructed from the graph by the? following method: 
15 Mark all unused nodes of the graph as unused starting nodes. 

For every s between 1 and n: 

As long as there are nodes S b ( k ), that are not yet 
marked as used starting nodes: 
Select an unused node S b (k>- 
20 Mark node S b (k) as used starting node; furthermore: 

Mark sequence Sbao and its complement as used. 
Now a new sequence S new is constructed, that is a 
new element in M seq or elongates an existing 
sequence M SU b of M seq : 
25 Set S new ,o := S b <k),i. 

If constructed as new sequence, set i := 0. 
If constructed as elongation of an existing sequence 
S SU b> then set i := l su b- 
While i< l seq - (lov - 1) and i = 0: 
30 If no unused successor node S b ( m > to node 
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Sb(k) exists, then mark node Sb(k) and its 
complementary node as unused and set i := 
i-1. 

Else select randomly an unused successor 
5 node Sb(m) of node Sb<k) and mark b m and its 

complementary node as unused. In 
addition, set: i := i + 1, k := m, and S ne w,i := 

Sb(m),l- 

If i = lseq - Oov - 1), then the sequence s ne w is 
10 finished: it includes the letters S ne w,o to 

Snew,(lseq-lov) . 

11 The structural, chemical, and physical properties of each of the sequences 
obtained in Step 10 are determined in comparison to the requirements defined in Step 3. 
If a sequence does not conform with the preset requirements, then it is deleted, and the 

15 nodes and complementary nodes used by it are again marked as unused. 

12 If the number of constructed sequences is not sufficient, the method cycle can 
be repeated, as often as necessary, from Step 3 or 4 onward. In particular, it is possible 
to set the structural, chemical and physical criteria and the values for n, l seq and lbas new 
for each method cycle. Thus, it is possible to construct sequences with different 

20 structural, chemical, and physical properties, as well as different length, that are still 
compatible, which means unambiguous and maximally dissimilar. 
Else, go to the next step, the in vitro synthesis. 

13 The constructed sequences are synthesized in vitro, and in the case of nucleic 
acids, preferably using an oligonucleotide synthesizer (e.g. ABI 392, ABI 398, ABI 

25 3948 by Perkin-Elmer Applied Biosystems). For this, the sequence data are 
communicated to the control unit of the oligonucleotide synthesizer and the sequences 
are manufactured as single-stranded nucleic acids. The sequences can also be ordered 
commercially. PAGE-purified oligonucleotides (e.g. obtainable at ARK Scientific 
GmbH Biosystems, 64293) are suitable. 

30 In order to translate grammars into molecules, the above-mentioned problem of 
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uniqueness violation when linking sequences must be solved (see Fig. 21). This 
problem cannot be solved with the Niehaus method, because the Niehaus method does 
not include the linking of sequences. In particular, the Niehaus method is not potent 
enough for the solution of the following partial problems: 
5 • The linking of sequences can lead to a uniqueness violation (see Fig. 21). 

• When linking variables and terminals, the sequence of a variable can link to 
more than one terminal sequence per end and vice versa (see Fig. 22 and Fig. 
24), depending on the grammar. 

• When linking e.g. several terminals with the same variable, uniqueness 
10 violations can occur, that cannot be resolved by a simple path search (see Fig. 

23). 

The problem is solved by a strategy referred to as "parallel extension", using the 
NFR method. More specifically, this is done as follows (illustrated for the 
construction of variables to predetermined terminals, see Fig. 22 to Fig. 24): 
15 LA group of sequences (here: the terminals, e.g. sequences for representing bits or 
sequences with certain chemical or biological properties) is given. 

2. Collect for each variable all pairs of terminals whose sequences will frame the 
variable sequence in the logomer and group them together, one group for each 
variable (see Fig. 22 and Fig. 24). Arrange the paths of the terminal pairs such that 

20 a gap remains between each pair, and the length of the gap corresponds to the 

variable path lengths (l seq - W + 1) plus l bas - 1 for each of the two transitions. 
Herein, a transition is the path of the length l bas - 1 that connects terminals and 
variables that belong together. In Fig. 21, the transition consists of the marked 
base sequences, for example. Transitions can converge (e.g. the terminals a, b c 

25 converge to the variable A in Fig. 22) or branch (e.g. the variable A branches to the 

terminals b, c and d in Fig. 22). 

3. Select the last base sequence of each terminal sequence on the left side as the 
starting node for the corresponding path (see Fig. 24). 

4. Generate the terminal-variable-transitions simultaneously, that is, find, in each 
30 iteration step, for all paths a successor node whose last monomer is the same for all 
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paths of a group. If in this step no unused and allowed node can be found, then 
backtracking is triggered. If this is not possible, then the translation of the 
grammar into algomers fails at this point and the translation must be repeated with v 
modified (preferably less restrictive) parameters. However, the multiple use of a 
5 node in an iteration step is permissible for the transitions that belong together (see 

Fig. 23 and Fig. 24). 

5. Generate also the variable sequences simultaneously as an elongation of the 
transitions. 

6. Generate the variable-terminal-transitions analogously as described in Fig. 4. 
10 Herein, the successor cannot be selected freely, but is given by the first l bas - 1 

nucleotides of the right terminal sequence. The path search follows this condition, 
in order to find out, whether the paths can be completed. 

This method, referred to as "parallel extension" makes it possible to create an 
interface between a computer and the steps executed in vitro, starting with the 
15 synthesizing of oligonucleotides, and to completely automate the steps from the 
definition of the grammars up to the manufacturing of molecules in vitro. 
Explanation of HI: Implementation of regular grammars with the NFR method 

The production of monomer sequences to represent the set of rules of a 
grammar in Step EI of the method of the present invention is carried out using the NFR 
20 method (Niehaus-Feldkamp-Rauhe method) according to Step II of the method of the 
present invention. To implement a grammar, exactly 2r monomer sequences are 
produced with the NFR method for r rules of a grammar G, so that the monomer 
sequences for the s represented symbols (terminals and variables) and the rules R of the 
grammar G are unambiguous and as dissimilar to one another as possible, and have the 
25 demanded structural, chemical, and physical properties. 

For this, the monomer sequences are produced according to the NFR method, 
so that they can be assembled to algomers, as described below. Two monomer 
sequences each are assembled to an algomer, which represents exactly one rule R of a 
grammar G. The two monomer sequences both contain sequences that contain the 
30 symbols (terminals or variables) that belong together and are required by rule R. 




The design of the oligomers according to Step IE of the method of the present 
invention depends on the chemical properties of the employed monomers. In the 
preferable case of oligomers made of nucleotides,, the following procedure is carried 
out: 

5 Algomers are double-stranded and have a 5' -strand {upper strand) and a 

3' -strand {lower strand). Upper strand and lower strand can be the link for, 
respectively, one terminal sequence and one variable sequence. 

X, Z are any variables, S is a variable functioning as the start symbol, and y 
and z are terminal symbols. Then, for each rule of the form: 

10 a) S:=yX 

an algomer is constructed contains an unambiguous double-stranded core sequence y, to 
which an unambiguous single-stranded overhang sequence X is appended at the 3' end; 
":=" in this case means that S and yX are identical, that is, yX functions as the starter 
molecule. 
15 b) X-^yZ 

an algomer is constructed that, that contains an unambiguous double-stranded core 
sequence y, to which an unambiguous single-stranded overhang sequence X is appended 
at the 5' end, and an unambiguous single-stranded overhang sequence Z is appended at 
the 3' end. X can be identical with Z. 
20 c) X -»z 

an algomer is constructed, that contains an unambiguous double-stranded core sequence 
z, to which an unambiguous single-stranded overhang sequence X is appended at the 5' 
end. 

Algomers of the form S := yZ and X ^ z are called terminators ("start", 
25 "end"), because they lead to chain break-off during polymerization in the linking step V 
according to the present invention, which is explained below; algomers of the form X 
yZ are called elongators, because they lead to a chain elongation during the 
polymerization in the linking step V according to the present invention. * 

Preferably, the monomer sequences produced in accordance with Steps II and 
30 m of the method of the present invention include nucleotides, in particular 



ribonucleotides, and most preferably deoxyribonucleotides. The monomer sequences 
constructed in Step n of the method of the present invention preferably contain for 
example certain sequences, such as stop codons, recognition sequences for enzymes 
cleaving nucleic acids (restriction nucleases), recognition sequences for proteins 
binding nucleic acids, such as described in [J. Sambrook, E.E Fritsch, T. Maniatis, 
Molecular Cloning, A Laboratory Manual, (1989)], [Rolf Knippers, Molekulare Genetik, 
Georg Thieme Verlag, (1997)] and [Benjamin Lewin, Genes V, Oxford University Press, 
(1994)]. 

Explanation of IV: Assembly of algomers 

The assembly of the sequences synthesized in Step HI is carried out in Step IV 
of the method of the present invention. In the context of the present invention, all 
techniques for assembling oligomer sequences known to a person skilled in the art can 
be applied to carry out Step IV. In the case of using nucleotide sequences, which is 
preferable in accordance with the present invention, the following procedure is suitable: 

a) The single-stranded sequences belonging to elongators are phosphorylized. 
For this, several protocols known to a person skilled in the art are possible, for example 
the following: In a 20 jxi preparation, 16 |il of a synthesized 100 |iM sequence, 2 pi of 
a ligation buffer (e.g. 50 nM Tris-HCl pH 7.5, 10 mM MgCl 2 , 10 mM dithiothreitol, 1 
mM ATP, 25 ]Ltg/ml BSA Bovine Serum Albumin, New England Biolabs) and 2 jxl PNK 
(polynucleotide kinase, e.g. by New England Biolabs, Catalogue No. #201 S or #201 L) 
are incubated for one hour at 37°C. The volumes, incubation time and incubation 
temperature can be varied in a manner known to a person skilled in the art. 

b) In one preparation, the single strands (upper and lower strand) belonging to one 
algomer are hybridized to one complete algomer. For elongators, it is possible to 
simply use the phosphorylation preparations of upper and lower strand. For the 
hybridization, several protocols are possible; it is preferable to use a denaturation step 
of about 95°C at the beginning (which also deactivates the PNK in the preparations of 
the elongators) and a slow hybridization. For example, for oligomers of length 30, it is 
possible to use the following protocol, which can be varied in accordance with the 
knowledge of a person skilled in the art: 
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In a 40 (il preparation, 20 (il of the upper strand (100 pM) and 20 jol of the lower strand 
(100 |iM) are heated in a thermal cycler for 5 minutes to 95°C, incubated for 5 minutes 
to 72°C, and then cooled at 1°C per minute for 25 minutes. 

Explanation of V: Manufacture of logomers by concatenating algomers: Symbol 
5 polymerization 

In Step V of the method of the present invention, algomers are linked to 
information-carrying polymers (logomers). Since the algomers represent a set of rules 
R of a grammar G, all generated logomers correspond to words of the language L(G), 
that is described by the grammar G. 
10 The regulated process of concatenating algomers to logomers is referred to as "symbol 
polymerization", and is also called "bit polymerization" if the terminal symbols 
represent bits, that is, if: 

£ := {0, 1, so, si, S2, Sn-i, s n , eo, ei, e2, e m -i, e m } withn, m e W, 
and n, m > 0 

15 In the context of the present invention, all techniques for linking oligomers into 

polymers that are known to a person skilled in the art can be applied to carry out Step V. 
In the application of oligonucleotides, which is preferably in accordance with the 
present invention, it is suitable to incubate the algomers to be linked under the presence 
of ligase. For example, it is possible to follow the following protocol, which can be 
20 modified in accordance with the knowledge by a person skilled in the art: 

In a 27 \il preparation, 1 jlU of 50jiM "start" algomer, 1 pi of 50|llM "end" 
algomer, and 2x each of 10 |al of 40|iM elongator algomers (obtained by Step IV of the 
method of the present invention), 1.5 julI of lOmM rATP and 3.5 jlQ of T4 DNA ligase 
with 400 NEB units/|Lil (e.g. Catalogue No. #202S and #202L NEB) are incubated 
25 between 4°C and 25°C for 24 hours. 

Other reaction volumes work analogously. Incubation time and incubation 
temperature can be modified as known to the person skilled in the art. 
Fig. 3 shows the result of such a symbol polymerization. 

In principle, it is possible to link any number of algomers that are not 
30 terminators. For each individual logomer, the polymerization process comes to a 
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standstill when the corresponding molecule carries an algomer at its respective ends, 
that functions as terminator molecule ("start," "end"). 
Isolation and amplification of logomers by cloning 

A further object of the present invention is a process for isolating and 
amplifying information-carrying polymers obtained by one of the previous methods by 
ligating the information-carrying polymers obtained in Step V in cloning vectors, 
transforming competent cells with these vectors and selecting the successfully 
transformed bacteria by selection markers. 

The logomers obtained from Step V of the method of the present invention as 
described above (symbol polymerization) can be cloned for isolation and amplification. 

For this, the used cloning vector is cut open by restriction enzymes, and a 
logomer is ligated into it. This^ method ensures that only completely polymerized 
logomers (provided with terminators) can be ligated with the target vector. Logomers 
that are correctly ligated into the target vector again form a ring-shaped molecule. The 
isolation of logomers is carried out by the transformation of bacteria with the molecule 
mixture that is the result of the ligation. The vectors carry selection markers 
(preferably antibiotics resistance, such as ampicillin resistance), with which transformed 
bacteria can be selected successfully. Since each bacterium expresses only one 
plasmid, each successfully transformed bacterium is the carrier of exactly one logomer. 
Logomers cloned in this manner can then be amplified in a simple manner and in large 
quantities for further use (on solid or in liquid media or fermenters). 

The logomers obtained by Step V of the method of the present invention are 
cloned for the purpose of isolation and amplification in any cloning vector (plasmid) 
that carries restriction cut sites that are compatible to the sequence overhangs of the 
terminators of the logomers (e.g. HindHI, BamHI recognition sequences in the cloning 
vector pBluescript II KS +/-, Stratagene, Catalogue No. #212207). The cloning is 
carried out in accordance with the usual laboratory protocols, such as described in [J. 
Sambrook, E.F. Fritsch, T. Maniatis, Molecular Cloning, A Laboratory Manual, (1989)]. 
For example, the following protocol can be used, and modified in accordance with the 
knowledge of a person skilled in the art: 
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a) restriction and preparation of the cloning vector 

For cloning, it is possible to use any cloning vectors that are conventionally 
used for biomolecular processes. (A suitable cloning vector is for example the plasmid 
pBluescript H KS +/-, Stratagene, Catalogue No. #212207). The plasmid is subjected 
5 to a restriction digestion with two restriction enzymes that generate overhangs that are 
compatible to the terminators of the logomers. For the restriction enzymes, it is 
possible to use BamHI and HindlE (e.g. by NEB, New England Biolabs, Catalogue No. 
#136S and #104S). For this, one of the usual restriction protocols can be used, which 
can be modified in accordance with the knowledge of the person skilled in the art, such 

10 as the following: 

In a 50 |Ld preparation, 30 [il plasmid (0.7 |ig/|il), 5 |Ld of lOx buffer (e.g. NEB2, 
New England Biolabs), 5 |Ld of BamHI (100 units), 5 |ol Hindffl (100 units) and 5 jol 
BSA lOx are incubated for 1-2 hours at 37°C. For control, 1 (il of the preparation as 
well as the corresponding amount of uncut plasmid is separated electrophoretically on a 

15 1% agarose gel (e.g. UltraPure™ by Gibco BRL, Life Technologies, Catalogue No.: 
15510-027). 

For the preparation of the DNA, the restriction preparation is phenolized, the 
DNA is precipitated and taken up again: 

• refill preparation to 200 jul with distilled water 

20 • add 200 jil phenole, vortex, and centrifuge at lOOOOg for 3 minutes 

• collect supernatant and add 200 jlxI chloroform, vortex, and centrifuge at 
lOOOOOg for 3 minutes 

• collect supernatant, acidify with 1/10 Vol. 3M NaAc, and mix with 2.5x volume 
100% EtOH, vortex, for at least 15 minutes to -70°C 

25 • centrifuge for at least 15 minutes to lOOOOg, collect, and discard supernatant 

• wash with ca. 500(il of 70% EtOH, centrifuge for 5-10 minutes to lOOOOg, 
collect supernatant, and discard 

• dry the pellet and take it up again in distilled water, so that the cut plasmid is 
present at 100 ng/|il to 1 jLtg/jLtl (higher concentrations are also possible). The 

30 plasmid can be diluted for further ligations if necessary: 




b) Ligation of the logomers into the cloning vector 

The logomers obtained in Step V of the method of the present invention and 
the prepared cloning vectors are ligated. The manner of the preparation ensures that, if 
possible, only the desired ligations take place: The terminators of the logomers carry 

5 two different overhang sequences, which are compatible to the overhang sequences of 
the cloning vector that were created by restriction. Thus, the logomers can ligate only 
in a predefined direction into the cloning vector. Furthermore, the terminators of the 
logomers are not phosphorylized, so that the logomers can ligate exclusively with the 
overhang sequences of the cloning vector. The ligation is carried out in accordance 

10 with one of the usual ligations protocols, as described for example in [J. Sambrook, E.R 
Fritsch, T. Maniatis, Molecular Cloning, A Laboratory Manual, (1989)], which can be 
modified in accordance with knowledge of a person skilled in the art, for example as 
follows: 

The molar ratio of logomers/cloning vector can be e.g. 500. In 10 |ol total 

15 volume: 

1 jj! plasmid (10ng/|il = 5nM) 

0.5 jliI of 4 |LiM logomers of Step V of the method of the present invention 
1 (il of lOx ligation buffer 
0.5 |il of T4 DN A ligase (400u/|Ld) 
20 7 (ol H 2 0 

are ligated for 12 hours to 16°C. 

The protocol be modified in accordance with the knowledge of the person 
skilled in the art. For example, it is possible to use a protocol for TCL (Temperature 
Cycle Ligation, [Lund, A.H., Duch, M., Pedersen, F.S, Increased cloning efficiency by 
25 temperature-cycle ligation, Nucleic Acids Research, 24:(4)], 800-801 , (1996)). 

The resulting ligation preparation is now used for the transformation of 
competent cells, as shown in the example below: 
c) transformation of competent cells 

The transformation of bacteria (competent cells, host: e.g. dH5oc, GIBCO BRL, 
30 Catalogue No.: 18258-012) with the ligation preparation of b) is carried out with one of 
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the usual protocols, as described for example in [J. Sambrook, E.E Fritsch, T. Maniatis, 
Molecular Cloning, A Laboratory Manual, (1989)], which can be modified in 
accordance with the knowledge of the person skilled in the art, for example as follows: 

• 200 jxl competent cells (~5xl0 7 CFU = colony forming units, stored at -70°C) 
5 are thawed and put on ice 

• about lng of the ligation preparation of b) is pipetted to it and left for 20-30 
minutes on the ice, while carefully mixing every 5 minutes 

• heat shock: heat the preparation for two minutes to 42°C, then back onto the 
ice 

10 • add 0.8 ml LB medium (+0.02M MgS0 4 +0.01M KC1) 

• roll preparation for 20-60 minutes in test tube 

• plate 0.1 to 1ml onto agar plate (with antibiotic, e.g. ampicillin) and to 37°C 
over night. 

The transformation functions as a method for selecting successfully cloned 
15 logomers and as a method for isolating individual logomers, because each bacterial cell 
expresses exactly one plasmid. 

In addition to the above-mentioned method of cloning, which is preferable in 
accordance with the present invention, the information-carrying polymers (logomers) 
obtained in accordance with Step V of the method of the present invention can be 
20 isolated and amplified with the following method: 

Isolation by hybridization and chromatographic methods 

Herein, the molecular mixture resulting from a symbol polymerization is 
applied on a carrier material, to which oligomers ("anchor sequences") are bonded, 
which can bond to the terminators of the logomers. The bonded logomers are then 
25 taken up individually and can be amplified as needed. 

Herein, several methods are applicable: 
a) affinity chromatography; herein, anchor sequences that have ends that are 
compatible to the overhanging ends of the terminators are bonded e.g. to a 
hydroxylapatite column. The logomers obtained by Step V of the method of the 
30 present invention are applied over the column, whereby the logomers can bond to the 




anchor sequences by hybridization. The logomers bonded in this manner are then 
taken up individually. 

b) anchor sequences are bonded covalently to a membrane (e.g. Gene Screen Plus, 
DuPont, Biotechnology Systems; Hybond-N, Amersham Life Sciences). The 
5 membrane has a grid, which means that it is partitioned into individual fields. Exactly 
one anchor sequence per field is bonded covalently to the membrane. Then, the 
logomers obtained in Step V of the method of the present invention are applied onto the 
membrane and hybridized with the anchor sequences. The membrane is then cut into 
the individual fields of the grid. The individual fields, which now contain maximally 
10 one logomer bonded to the corresponding anchor sequence can now be introduced into 
separate vessels (such as Eppendorf tubes). Then, the logomers can be separated by 
denaturing from the membrane. When denaturing, it is important to select a 
denaturing temperature that it is high enough to separate logomer and anchor sequence, 
but not so high that the logomer itself melts completely. 
1 5 Isolation by Dilution 

In isolation by dilution, the logomers that are obtained as a mixture of a symbol 
polymerization are diluted so much, that the target volume statistically contains exactly 
one molecule. This molecule can then be detected and amplified by PCR. The 
dilution of the original volume into target volumes can be carried out concurrently, so 
20 that an original volume is diluted into n target volumes. 

In accordance with the present invention, this is carried out as follows: 
For a mixture of logomers, obtained by Step V of the method of the present invention, 
in an original volume V a , a dilution factor is determined, that dilutes n logomers 
contained in V a such that each target volume V z statistically contains exactly one 
25 logomer. In order to determine the dilution factor, an aliquot of the original volume 
(e.g. 1 |il) is titrated in a dilution series. Then, for each dilution in the series, an 
aliquot (e.g. 1 (il) is used as a template in a PCR reaction, in order to determine, in what 
dilution the logomers are still detectable. Thus obtaining the last dilution containing 
logomers and the dilution that already contains no logomers, it is possible to specify a 
30 rough dilution factor that contains just about one logomer. If necessary, this dilution 
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factor can be determined more precisely by titrating the last dilution that still contains 
logomers. For this, smaller dilution steps are used (for example, if the first dilution 
' series was diluted at 1 : 10, then the second dilution series can be diluted at 1 : 2; in 
this manner, the method for precise determination of the dilution factor can be carried 
5 out in principle at any desired precision). 

Since the PCR of the dilutions is for determining whether there are logomers 
left in the dilution, the PCR can be carried out in two ways: 

a) Two anti-sense primers priming in the terminators, e.g. the 5' strand of the start 
terminator and the 3' strand of the end terminator, are taken as primers. Otherwise, the 

10 PCR conditions are as described below in Amplification of logomers by PCR. 

b) The PCR is carried out like the PCR for reading out logomers, but a 
preparation with a pair of primers as described under Readout of logomers by PCR 
below is sufficient. 

For control, the DNA fragments obtained by PCR are visualized by gel 
15 electrophoresis. For this, e.g. a 2 - 4% agarose gel (e.g. UltraPure™ by Gibco BRL, 
Life Technologies, Catalogue No.: 15510-027) and as the molecular weight standard e.g. 
a 50bp ladder (Gibco BRL, Life Technologies, Catalogue No. 10416-014) are used, 
The resulting gel is colored for 5 minutes in 0.001% ethidium bromide and made visible 
under UV light. 
20 Amplification of logomers 

The logomers obtained by a symbol polymerization can - depending on the 
intended application - be amplified. This may be necessary in order to visualize the 
stored information or to further use the logomers as markers for the labeling of 
substances and objects. 
25 Amplification of logomers by PCR 

With this method, logomers are amplified by PCR. The logomer to be 
amplified serves as a template, and the necessary primers prime in anti-sense in the 
terminators (either 5' start and 3' end, or 5' end and 3' start), so that the logomer is 
amplified completely. Alternatively, the primers can also prime outside of the 
30 logomers, if the logomer to be amplified is surrounded by further DNA. 
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For PCR preparations that can be varied in manners known to a person skilled 
in the art, the following reagents and conditions are used, for example: 

dNTPs (e.g. Pharmacia Biotech, Catalogue No.: 27-2035-01/2/3), Taq 

polymerase (e.g. Gibco-BRL, Catalogue Nos.: 10838-034, 10838-042, 18038-067), 

PCR buffer, MgCl 2 (e.g. GIBCO-BRL, Catalogue No.: 18067-017) 

amount (id) 



dNTP, lOmM 
PCR buffer lOx 
MgCl 2 , 25mM 
TAQ, 5u/iil 
primer 1, 10}iM 
primer 2, 10|iM 
template 
(logomer) 
ffcC- 
total volume 



1 

5 
5 

0.5 
1 
1 
1 

35.5 
50 



As the PCR program, it is possible to use the following protocol (primers: 
30-mers), for example: 



goto step number of 
repetitions 



step 


action 


temperature 


duration 


1 


denature 


95°C 


00:05:00 


2 


denature 


95°C 


00:00:30 


3 


annealing 


68°C 


00:00:30 


4 


polymerization 


72°C 


00:00:30 


5 


goto 






6 


cool 


4°C 


(any) 


7 


end 
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10 

A further object of the present invention are information-carrying polymers 
that are obtainable by the methods of the present invention described above. 
Reading of the information contained in the logomer s 

Finished logomers can be read either by PCR (polymerase chain reaction), 
15 which is preferable in the present invention, or by restriction digestion. The PCR 
method has the advantage that even smallest amounts of DNA can be amplified such 
that the logomers can be read directly after the PCR and a gel separation. In the case 




of binary logomers, the band pattern obtained by PCR on the gel can be read 
immediately as binary code. 

Thus, a further object of the present invention is a method for reading 
information of information-carrying polymers, that have been obtained and/or isolated 
5 and amplified as described above, such that 

a) one pair of anti-sense primers each is mixed into n solutions containing the 
information-carrying polymer, wherein n is the number of oligomers contained as 
elongators in the polymer; 

b) at least n-1 PCR processes are carried out, wherein n is the number of 
10 oligomers contained as elongators in the polymer, and one primer of each pair primes in 

the terminator opposite to the elongator and the other primer primes in the elongator 
itself; 

c) the polymer fragments obtained by PCR are separated according to length 
using electrophoresis; and 

15 d) the pattern obtained by electrophoresis is read out optically. 

The method can be carried out using primers or nucleotides that are 
fluorescence marked by various colors. Thus, it becomes possible to mark several 
preparations by different colors and to read several preparations in electrophoretically 
one track. Furthermore, it is thus possible to automate the readout process with 

20 modern sequencing machines (e.g. available by ABI). 
Readout of logomers by PCR 

In order to visualize the information contained in the logomers, it is possible to 
read out the logomers by PCR. The method of reading binary patterns by PCR has 
been described as DNA typing in [Jeffreys, Minisatellite repeat coding as a digital 

25 approach to DNA typing, Nature, 354, 204-209, (1991)]. The method described in this 
article is used to read out logomers in a modified form as readout PCR. The 
differences to the method described in [Jeffreys, Minisatellite repeat coding as a digital 
approach to DNA typing, Nature, 354, 204-209, (1991)] are that here, synthetically 
produced templates are used, that do not necessarily contain binary patterns. 

30 Furthermore, the PCR and gel electrophoresis conditions are selected such that the 
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result can be read directly from the gel, without having to further amplify the signals 

obtained as band patterns by blotting. 

a)PCR 

To read a logomer, at least n - 1 , but usually n PCR preparations are needed for 
5 n elongators of the grammar. Every PCR preparation contains the logomers to be read 
as a template and furthermore a pair of anti-sense primers, of which one primes in the 
elongator, and the other one in the terminator opposite to the elongator (e.g. 5' -"start" 
and 3'-0, see Fig. 5). For the PCR, it is possible to use any commercial thermal cycler 
(e.g. PTC-100, MJ Research). A length of 20-30bp has been found to be suitable as 
10 the length of the primers, but it is also possible to use other lengths. The annealing 
temperature has to be selected in accordance with the length of the primers (and can be 
determined e.g. using the program Oligo 5.0). Suitable annealing temperatures for 
20-mers are about 55°C, and for 30-mers about 65°C to 74°C. 

For the PCR preparations, the following reagents and conditions are used, for 
15 example that can be varied in manners known to the person skilled in the art: 

dNTPs (e.g. Pharmacia Biotech, Catalogue No.: 27-2035-01/2/3), Taq polymerase (e.g. 
Gibco-BRL, Catalogue Nos.: 10838-034, 10838-042, 18038-067), PCR buffer, MgCl 2 
(e.g. GIBCO-BRL, Catalogue No.: 18067-017) 

amount (|il) 



20 



dNTP, lOmM 
PCR buffer lOx 
MgCh, 25mM 
TAQ, 5u/nl 
primer 1, 10nM 
primer 2, 10|aM 
template 
(logomer) 
H z O 

total volume 



1 
5 
5 

0.5 
1 
1 
1 

35.5 
50 



In order to obtain enough DNA to read the resulting band pattern after the gel 
electrophoresis directly from the gel, it is possible to prepare each PCR preparation m 



times. For example, m = 4. 
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As the PCR program, it is possible to use the following protocol (primers: 
30-mers), for example: 

goto step number of 
repetitions 



step 


action 


temperature 


duration 


1 


denature 


95°C 


00:05:00 


2 


denature 


95°C 


00:00:30 


3 


annealing 


69.5°C 


00:00:30 


4 


polymerization 


72°C 


00:00:30 


5 


goto 






6 


cool 


4°C 


(any) 


7 


end 
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After the PCR has been carried out, it may be advantageous for the readability 
5 of the band pattern of the subsequent gel electrophoresis to hold as much amplified 
DNA in as little volume loaded onto the gel as possible. The volumes can be narrowed 
down by any suitable method (e.g. with a SpeedVac, e.g. SpeedVac Concentrator, 
Savant) 

b) gel electrophoresis 

10 The DNA fragments obtained from the PCR for each elongator have different 

lengths. Separating them by electrophoresis, the various length fragments result in a 
specific pattern, by which it is possible to identify the logomers to be read (see Fig. 5 
and Fig. 6). For the gel electrophoresis, it is possible to use a 4% agarose gel (e.g. 
UltraPure™ by Gibco BRL, Life Technologies, Catalogue No.: 15510-027), and as the 

15 molecular weight standard it is possible to use e.g. a 50bp ladder (Gibco BRL, Life 
Technologies, Catalogue No.: 10416-014). The separation of the DNA is performed in 
a gel chamber in any suitable manner, such as 1:45 hours at 60V. However, the 
parameters can be selected freely, in particular, it is possible to shorten the gel run-time. 

The resulting gel is colored for five minutes in 0.001% ethidium bromide and 

20 is visualized under UV light. 

An example of the resulting gel is shown in Fig. 6. 
Reading out logomers by restriction digestion 

Logomers can be read out by restriction digestion. For this, the elongators 
must be configured such that they carry asymmetrically shifted restriction cut sites. 
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Each elongator carries a specific restriction cut site. For example, the restriction 
enzyme EcoRV (e.g. NEB, Catalogue No. #195S) can be used for O-elongators, and 
Smal (e.g. NEB, Catalogue No. #141S) can be used for 1-elongators. 

For the readout, the logomer to be read out is cut in different restriction 
5 preparations. For this, one restriction preparation per elongator is formed with the 
corresponding restriction enzyme. The DNA fragments obtained from the restriction 
have different lengths. When they are separated by electrophoresis, the different 
length fragments result in a specific pattern, by which it is possible to identify the 
logomer to be read out. 

10 The following example shows the DNA fragment lengths of all binary 

logomers with 4 bit without terminators at an elongator length = 30bp; restriction cut 
site per elongator after lObp. 



binary number 


0 


1 




0000 


10, 40, 40, 40, 30 


160 




0001 


10,40,40, 70 


130, 30 




0010 


10, 40, 80, 30 


90, 70 


* 


0011 


10, 40, 110 


90, 40, 30 




0100 


10, 80, 40, 30 


50, 110 


* 


0101 


10, 80, 70 


50, 80, 30 




0110 


10, 120, 30 


50, 40, 70 




0111 


10, 150 


50, 40, 40, 30 




1000 


50, 40, 40, 30 


10, 150 




1001 


50, 40, 70 


10, 120, 30 




1010 


50, 80, 30 


10, 80, 70 




1011 


50, 110 


10, 80, 40, 30 


* 


1100 


90, 40, 30 


10,40, 110 




1101 


90, 70 


10, 40, 80, 30 


* 


1110 


130, 30 


10, 40, 40, 70 




1111 


160 


10, 40, 40, 40, 30 





In the above example (of 4 bits), it can be seen that the length pattern of the 
15 numbers 0010 and 0100 for restriction of the 0-bit is ambiguous. However, it is 
possible to distinguish between 0010 and 0100 using the length pattern for restriction of 
the 1-bit. 

In particular, the above example was carried out as follows, although the 
protocols can be varied according to the knowledge of a person skilled in the art: 



42 



a) DNA extraction 

A bacterial colony obtained e.g. by Isolation and amplification of logomers by 
cloning is used to inoculate 2-5ml of a 37°C overnight culture (e.g. LB medium with 10 
(Xg/1 ampicillin, as described in [J. Sambrook, E.F. Fritsch, T. Maniatis, Molecular 

5 Cloning, A Laboratory Manual, (1989)]. As the cloning vector, it is possible to use for 
example pGEM-T Easy (Promega, Catalogue No. A1360) and binary logomers without 
terminators. The plasmid DNA containing the logomers is isolated from the overnight 
culture (for example using the Qiagen Plasmid Miniprep Kit, Qiagen, Catalogue No. 
12123, or by alkaline lysis as described in [J. Sambrook, E.F. Fritsch, T. Maniatis, 

10 Molecular Cloning, A Laboratory Manual, (1989)] . 

b) Restriction 

The plasmid DNA obtained from a) is cut for n elongators into n restriction 
preparations. Each restriction preparation contains a restriction enzyme that cuts in a 
specific elongator and a restriction enzyme that the logomer cuts from the cloning site. 
15 In this example, the following preparations were used: 





name 


quantity 


volume (|il) 


plasmid 


logomer in pGEM-T 


500 ng/nl 


6 


DNA 


easy (see above) 






buffer 


NEB 2 


lOx 




BSA 




lOx 




enzyme 1 


ECORV 


10u/|il 




enzyme 2 


ECO RI 


10u/|il 


10 


total 








name 


quantity 


volume (|il) 


plasmid 


logomer in pGEM-T 


500ng/(il 


6 


DNA 


easy (see above) 




1 


buffer 


NEB4 


lOx 


BSA 




lOx 


0 


enzyme 1 


Smal 


10u/^l 


1 


enzyme 2 


ECORI 


10u/|jl 


1 


H 2 0 






1 


total 






10 
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The listed preparations are incubated for one hour at 37°C. 
c) Gel electrophoresis 

The DNA fragments obtained by b) are separated electrophoretically (e.g. with 
4% Agarose UltraPure™ by Gibco BRL, Life Technologies, Catalogue No.: 15510-027), 
5 colored with ethidium bromide (0.001%) and made visible under UV light (see Fig. 8). 
The above-described methods enable for example the implementation of binary 
logomers: 

Binary logomers are generated by grammars with: 
£:= {0, l,s 0 , si,s 2 , Sn-i, s n , eo, d, e 2 , ...,e m -i, e m } with n, me 1N 9 and 
10 n, m ^ 0. 

They allow the simplest and most universal representation of data, which is 
also applied to conventional data processing machines. Binary logomers are the result 
of a bit polymerization. Since practically any symbols and rules can be coded 
depending on the chosen grammar, it is also possible to generate any data and data types 
15 with them. 

The above-described methods enable for example the implementation of character 
alphabets: 

Logomers can be used to code the characters of a chosen alphabet. Different 
character lengths can be chosen, but it is advantageous to use character lengths that are 
20 also used in conventional data processing machines (half byte, 1 byte, 2 bytes, 4 bytes, 8 
bytes, etc.). Also the conventions for interpreting the represented characters can be 
selected freely. The characters can be numbers, letters, alphanumeric characters or any 
other data structure. 

The above-described methods enable for example the implementation of a 1-byte 
25 alphabet: 

Given is a grammar G = (X, V, R, S) with the terminal alphabet £ := {0, 1, s, 
e}, set of variables V := {S 0 , Si, S 2 , S 3 , S 4 , S 5 , S 6 , S 7 , S 8 }, start symbol S and a set of 
rules 
R:= 

30 { , 




S := sSo 
So OSi 

50- »lSi 

51- »0S 2 

5 St 1S 2 
S 2 OS 3 

s 2 is 3 
S3 0S4 
S3 -> 1S 4 

10 S 4 os 5 
s 4 is 5 
s 5 -» os 6 

S 5 1S 6 

s 6 os 7 

15 S 6 1S 7 

s 7 -»os 8 

s 7 is 8 

S8 -» e 

} 

20 wherein: 
s := start 
e := end. 
Example: 

With the rules according to the given grammar, all 8-bit characters can be 
25 generated, for example: generation of 8-bit 0, 00000000: 

S sSo sOSi s00S 2 s000S 3 s0000S 4 -> sOOOOOSs sO00000S 6 
sOOOOOOOS? sOOOOOOOOSs sOOOOOOOOe 
for example: generation of 8-bit 255, 11111111: 

S ■» sS 0 slSi sllS 2 SIIIS3 SIIIIS4 SIIIIIS5 sllllllSe 
30 slllllllS7-»sllllllllS8-*slllllllle 
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for example: generation of 8-bit 85, 01010101: 

S sS 0 ^ sOSi s01S 2 s010S 3 SOIOIS4 s01010S 5 s010101S 6 
SOIOIOIOS7 s01010101S 8 sOlOlOlOle 

5 The sequences necessary for the in vitro implementation are produced 

according to the NFR method, as described in Steps II and m. Algomers and 
logomers are manufactured as described in the inventive Steps IV and V. The 
contained logomers are preferably isolated and amplified by the method described under 
Isolation and amplification of logomers by cloning. The alphanumeric symbols 

10 generated in this manner can now be used for various technical purposes (e.g. marking 
and identifying substances), and, possibly after additional technical steps, read out in 
accordance with the inventive method for reading information from 
information-carrying polymers (as described above). 

Since the generated binary patterns can be read, as desired, as data or symbols, 

15 they can be interpreted as alphanumeric symbols, for example. If they are interpreted 
as numbers, then the stated grammar generates 8-bit random numbers. 

From a sufficiently large set of random 8-bit binary patterns, it is possible to 
isolate all 8-bit patterns and set up as a library (e.g. for representing all alphanumeric 
characters). 

20 The above-described methods enable for example the representation of character 
strings: 

Since the algomers can be produced such that logomers of different length can 
be generated, it is also possible to represent character strings. However, for practical 
reasons, it is preferable to provide only few characters per logomer. For example, a 

25 single logomer can contain 4 bits (half-byte) or 8 bits (1 byte). (If more bits are 
necessary, then it is preferable to use only multiples of 8 bits = 1 byte.) In binary 
representation, it is then possible to represent any alphanumeric character in any coding 
(e.g. ASCII, ANSI, ISO 8859-x, Unicode.) One character can also span several 
logomers (for example: half-byte representation, in which a 1-byte character spans two 

30 logomers, or Unicode, in which a 16-bit character spans two 8-bit or four 4-bit logomers. 




For the representation of character strings, it is then necessary to provide the individual 
characters with positional information. For that, it is sufficient to use for that grammar 
different terminators (e.g. different "start" terminators), whose sequences represent the 
corresponding positional information. 
5 Example: 

Given is a grammar G = (X, V, R, S) with a terminal alphabet X : = {0> 1» e > s o, 
si, s 2 , s n -i, s n }, a set of variables V := {So, Si, S 2 , S 3 , S 4 , S 5 , S 6 , S 7 , Sg}, a start 
symbol S and a set of rules 
R .= 
10 { 

S := SiS 0 
So OSi 

50 IS, 

51 0S 2 
15 Si 1S 2 

S 2 -» 0S 3 
S 2 1S 3 
S 3 0S 4 
S 3 -> 1S 4 
20 S 4 0S 5 
S 4 ISs 
S 5 -»0S 6 
S 5 -» 1S 6 

s 6 -»os 7 

25 S 6 -»1S 7 
S 7 -> 0S 8 
S 7 1S 8 
S 8 -»e 

} 

30 wherein: 




Si := start with i= {0, n} 
e := end. 

The 1-byte alphabet generated with this grammar is sufficient, for example, for 
5 all alphanumeric characters of the ASCII, ANSI or ISO 8859-x standards. For a 
character string of length n, n different terminators are necessary, so that the character 
strings generated with this grammar are constructed as follows: 
soxe, sixe, s n -ixe, s n xe 

wherein x is any binary representation with, in this case, 8 bits. 
10 In this case: 

s 0 xe := character x at position 0 (0 th character) 
sixe := character x at position 1 (1 st character) 
s 2 xe := character x at position 2 (2 nd character) 
etc. 

15 With this grammar, it is possible to represent for example the character string 

"Elisabeth" in ASCII code: 

soOlOOOlOle, siOllOllOOe, s 2 01101001e, s 3 01110011e, s 4 01100001e, s 5 01100010e, 
s 6 01100101e, s 7 01110100e, s 8 01101000e. 

Alternatively, it is also possible to select £ •= {°> 1> s > eo> e i> e 2> en-i, e n }, 
20 whereby the character string "Elisabeth" in ASCII code is represented as: 

sOlOOOlOleo, sOUOHOOei, s01101001e 2 , s01110011e 3 , s01100001e 4 , s01100010e 5 , 
s01100101e6, s01110100e 7 , s01101000e 8 . 

A further possibility is £ : = {0, 1, so, si, s 2 , s n -i, s n , eo, ei, e 2 , ea-i, e n }, in 
which case the character string "Elisabeth" in ASCII code is represented as: 
25 so01000101eo, siOllOHOOei, s 2 01101001e 2 , s 3 01110011e 3 , s 4 01100001e 4 , s 5 01100010e 5 > 
s 6 01100101e6, s 7 01110100e 7 , s 8 01101000e 8 . 

In half-byte representation with positional information, the character string 
"Elisabeth" in ASCII code is represented as: 

soOlOOe, siOlOle, s 2 0110e, s 3 1100e, s 4 0110e, s 5 1001e, s 6 0111e, s 7 0011e, s 8 0110e, 
30 s 9 0001e, SioOllOe, snOOlOe, si 2 0110e, si 3 0101e, swOllle, Si 5 0100e, si 6 0110e, Si 7 1000e. 
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Because of the positional information, which is represented by the sequence of 
terminators, the individual characters can be processed independently from one another, 
and read independently from one another, for example. 

With such an alphabet, the representation of any data type is possible. For example, it 
5 is possible to combine two bytes each (0 th + 1 st , 2 nd + 3 rd , n* + n+1*) to a 2-byte 
code (e.g. Unicode). Alphanumeric character strings can be used to represent any 
indicator , name, numbers, date, etc. 

The sequences necessary for the implementation in vitro are produced 
according to the NFR method, as described in Steps H and m. Algomers and 
10 logomers are produced as described in the inventive Steps IV - V. The resulting 
logomers are preferably isolated and amplified by the method described under Isolation 
and amplification of logomers by cloning. The alphanumeric symbols generated in 
this manner can now be used for various technical purposes (e.g. marking and 
identifying substances), and, possibly after additional technical steps, read out in 
15 accordance with the inventive method for reading information from 
information-carrying polymers (as described above). 

The above-described methods enable for example the implementation of a random 
number generator: 

For certain problems in computer science (e.g. simulations) and mathematics, 
20 random numbers are required. Random numbers are ordinarily generated by 
computer-based algorithms. However, since those algorithms are deterministic, the 
generated random numbers are only pseudo-random numbers. In addition, the 
generated number series are randomized at varying quality, due to the varying quality of 
the algorithms. 

25 For some applications, however, real random numbers are required. In that 

case, a physical process has to be included in the generation of random numbers. This 
process can be for example the noise of the sound card in a PC, which is then processed 
by corresponding software. 

Using the methods described in this specification, it is possible to implement a 

30 real random number generator, which generates random numbers much faster than is 



possible with conventional methods. 

The random number generator is implemented with the following grammar 
(grammar for binary random numbers of any length): 
G = (I, V, R, S) with X := {0, 1, s, e}, V := {S} 
5 R:= 
{ 

S := sA 
A OA 
A 1A 
10 A-»e 

} 

wherein 
s := start 
e := end 

15 with this grammar, it is possible to form words with the above-mentioned 

terminal alphabet 
£ := {0, 1, s, e} 

All words of the language L begin with an "s" and end with an "e", and a 
random number (including 0) of random bits ("0" and "1") is contained therebetween. 
20 Example: Words of the language L, that can be formed according to R: 
S sA se 
S -> sA slA sle 

S sA sOA sOOA sOOlA -» sOOlOe 

S sA slA -> slOA slOOA slOOOA slOOOOA S100000A 
25 slOOOOOOe 

The binary patterns generated as words of the language can be read as letters, 

numbers, alphanumeric characters, or character strings. When reading them as the 

binary representation of numbers, then the above binary patterns are, in decimal 

representation: 
30 0 (empty) 




1 

2 

32 

and can be used as random numbers. In principle, any of the sequences is suitable for 
5 in vitro implementation, as long as they are unambiguous and maximally dissimilar to 
another. 

For example, it is possible to use the following algomers: 
sA: 

agctttatatctccatttgccctagtgaag 
10 aatatagaggtaaacgggatcacttcaacc 

A OA: 

ttggcgagatatcaacgccaccccttgctt 
gctctatagttgcggtggggaacgaaaacc 

15 

A-> 1A: 

ttggcgacccgggaaacaactattgctagt 
gctgggccctttgttgataacgatcaaacc 

20 A e: 

ttggtgcgggagttggaagcaactacgatg 
acgccctcaaccttcgttgatgctacctag 

The necessary sequences are commercially available. They can, for example, 
25 be ordered as 40nmol, PAGE-purified (ARK-Scientific, Darmstadt) and taken up by 
distilled water (recommended storage at -20° C). Algomers are produced from these 
sequences in the manner described in Step IV of the present invention. The algomers 
are polymerized to logomers, as described in Step V of the present invention, and can be 
used to read out information from information-carrying polymers (as described above). 
30 Random numbers generated like this are shown in Fig. 6. 
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The above-described methods enable for example the encryption of logomers: 

The information contained in logomers can be encrypted. For this, at least 
one of the primers needed for the readout PCR functions as the secret key (see Fig. 11). 
It is preferable that this is one of the primers priming in the terminators. If the 
5 sequence of this primer (key sequence) and the primer itself are known only to 
authorized personnel, then the information contained in the encrypted logomers is also 
accessible only by the authorized personnel. 

In order to prevent the information contained in the logomers from being read 
out without that key, further precautions are necessary. Possible attempts to break the 

10 encryption could be based on reading the logomers by amplifying non-secret sequences. 
If bacterial vectors are used for the cloning of logomers, a potential attacker could try, 
for example, to amplify the entire cloning site by PCR and then read it by sequencing, 
or to "read in" the resistance genes necessary for the selection into the cloning site. 
Also, the elongators themselves might serve as attack points for PCR -based reading of 

1 5 the secret sequence(s) . 

What these attacks have in common is that they might try to decrypt the key 
sequence with non-secret sequences, and thus read the encrypted sequence. A defense 
against such attacks is possible by using only completely secret sequences for the 
information transport via logomers. However, this might be too labor-intensive and 

20 expensive. Instead, or supplementally, it is possible to mix the information-carrying 
logomers with sequences (dummy sequences) that contain the same potential attack 
points (e.g. bits, resistance genes), but each containing different key sequences. With 
these dummy sequences, potential attacks are successfully blocked. For unauthorized 
accesses, all individual sequences are indistinguishable and therefore the encrypted 

25 information remains concealed. The more dummy sequences are mixed into a logomer, 
the more difficult it becomes to break the key sequence. 

This method corresponds to molecular stenography and is illustrated by the 
following example of the encryption of a character of a 1-byte alphabet: 
The used grammar is G = (£, V, R, S) with terminal alphabet X *= {0, 1, Sk ey , So, si, S2, 

30 sm, s f , e}, wherein feW,f S 1. s key is the secret key. For a key length of 1, the 
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maximal theoretical encryption is Key max = a 1 , the maximally usable encryption Key e ff = 
a l ~ d wherein d is the number of bases, in which the dummy key has to be different from 
the correct key, so that in the readout PCR no dummy key reads the correct sequence 
and conversely, the correct key cannot read out dummy logomer (but only the target 
5 logomer). For highly specified PCR conditions, it is possible to use d S 1. 
Furthermore, Keyimp = number of dummy keys + 1 is the actually used encryption, and 
Keymin = 1 + x is the minimum encryption. Here, x is the number of dummy logomers 
necessary for a minimum encryption. In the case of using n logomers as information 
carriers, x can be selected to be larger than n. 

10 For character strings, there is automatically an additional encryption, because 

the order of the characters, which is not known to the potential attacker, acts as an 
additional encryption. 

Example: The following grammar G is used for a grammar with character 
strings of n 1-byte characters: 

15 The used grammar is G = (£, V, R, S) with terminal alphabet £ := {0, 1, Skey, 

SkeyO, SkeyU Skey2, Skeyn-1, Skeyn, So, Si, S2, Sf-i, Sf, e}, wherein n, f € W, n 1. n 

expresses the number of used real keys, f is the number of used dummy keys; the used 
set of variables V and the used set of rales R can be selected freely and depending on 
the information to be encoded. 

20 Grammars with different character coding function analogously. 

The sequences necessary for the implementation in vitro are produced 
according to the NFR method, as described in Steps IT and m. Algomers and 
logomers are produced as described in the inventive Steps IV and V, and the contained 
logomers are preferably isolated and amplified by the method described under Isolation 

25 and amplification of logomers by cloning. The logomers generated in this manner can 
be read out with the readout PCR described above, wherein - as described above - one 
of the primers necessary for readout, preferably one of the primers priming in the 
terminators, functions as the secret key. With the grammar described above, "dummy 
sequences" ("dummy logomers") are generated in addition to the information-carrying 

30 logomers, which are supposed to prevent unauthorized readout access to the 
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information contained in the logomers. The method can be used for all applications of 
logomers and has, among others, the advantage to be very effective due to the 
specificity of the primer used as the secret key. 

Based on the above-described method, it is possible to implement symmetric 

if . - . . 

5 and asymmetric encryption schemes. L For a symmetric encryption scheme, it is 

sufficient if the communicating participants A and B agree on a secret key sequence. 

Participant B encrypts a message stored as a monomer by manufacturing logomers only 

with the secret terminators and mixing additional logomers that carry other terminators 

into the message. Only A is then in position to provide the primer pair that is 

10 necessary for readout, because only A knows the necessary terminator sequence. 

An asymmetric encryption scheme, on the other hand, necessitates additional 
effort, such as the application of molecules with irregular forms (see for example Fig. 
12 and Fig. 13): When a communication partner B wants to send an encrypted 
message to A, then B receives a public key from A, which B uses to encrypt the 

15 message represented by a logomer. The public key A includes a pair of terminators 
and a pool of additional dummy terminators with similar properties, but deviating 
sequences and other 3 '-overhang sequences. Of the genuine terminators, only the 3' 
overhang sequence is known, with which it can be linked to the - publicly known - 
elongators. To encrypt a message, B generates logomers, using the key that was 

20 publicly provided by A (and which includes terminators and dummy terminators) for 
the symbol polymerization reaction. This configuration of the terminators ensures that 
only A is in a position to read out the thusly encrypted message: Since only A knows 
the actual sequence of the genuine terminators, only A can read, by readout PCR, the 
message that has been encrypted as logomers. 

25 Since the public key can theoretically be attacked via the known overhang 

sequence to the elongators (because the dummy terminators may not possess this 
overhang sequence, a potential attacker can link any sequence to the genuine 
terminators, whereby the sequence of the terminators can be identified), molecules 
based on irregular shapes, such as based on the Y-shaped molecules shown in Fig. 12, 

30 are used for the terminators. The illustrated Y-shaped molecules can be combined 




further to tree-like branching molecules. The root of such a terminator molecule 
realized as a tree then contains the overhang sequence compatible to the elongators, 
whereas only one of the branches of the tree contains the genuine terminator sequence. 
Since, as explained above, additional dummy logomers are mixed into the encrypted 
5 message, only A, who knows the genuine terminator sequence, can filter the proper 
message from the set of molecules by linking with the compatible vector, whereas a 
potential attacker who does not know the genuine terminator sequence cannot do this. 
The above-described methods enable for example encryption with real-time 
identification: 

10 Without reading out the information contained in encrypted logomers, the 

presence of the key sequence itself can be indicative of whether a marked product is 
genuine or not. The information about the authenticity of the marked product can be 
verified or falsified in a process in which the key primer serves as hybridization probe 
of a fluorescence detection reaction. 

15 a further object of the present invention is therefore the use of 

information-carrying polymers, in particular of the information-carrying polymers 
obtained in accordance with the present invention, for the encryption of information. 
The above-described methods enable for example a test for the quality of 
oligonucleotides 

20 A typical problem regarding the production of oligonucleotides, for example 

for molecular biology, is the quality control of the oligonucleotides, which tends to be 
problematic because of the production process and the varying quality of the used 
chemicals. 

The method described the Steps I - V of the present invention can be used to 
25 test the quality of oligonucleotides. For this, a binary grammar such as for the random 
number generator (see above) or a unary grammar such as for the production of 
molecular weight standards (see below) is used for the generation of logomers of any 
length. Under otherwise identical conditions, the length of the logomers obtained by a 
symbol polymerization (Step V of the method of the present invention) is then directly 
30 proportional to the quality of the used oligonucleotides: The longer the 




electrophoretically visualized logomers are, the better is the quality of the used 
oligonucleotides. An example of the ladder of logomers of different length obtained 
with a symbol polymerization is shown in Fig. 3. 

A further object of the present invention is therefore the use of 
5 information-carrying polymers, in particular of the information-carrying polymers 
obtained in accordance with the present invention, for the quality control of 
synthetically manufactured oligonucleotides. 

The above-described methods enable for example the manufacture of logomers as 
markers and signatures 

10 Because of their ability to represent, in principle, any piece of information, the 

logomers described here can be used as markers for labeling manufactured goods, 

products, substances, and devices. 

A further object of the present invention is therefore the use of 

information-carrying polymers, in particular of the information-carrying polymers 
15 obtained in accordance with the present invention, as markers or signatures. 

The logomers can be used as a "binary label," that is added to a product, and that 

contains information about the thusly labeled product. The logomers can contain e.g. 

information about 

a) manufacturer 
20 b) product (e.g. serial number) 

c) application purpose 

d) substance class 

e) danger class 

f) quality 
25 g) purity 

h) production and expiration date 

It is possible to provide substantially any product or manufactured goods with 
substantially any information. Since the logomers are non-toxic and biodegradable, 
they can also be used for highly sensitive products, such as food, medical and 
30 pharmaceutical products. 
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Labeled products can be for example: 

a) chemical products, e.g. varnishes, paints, oils, lubricants, fuels, solvents, inks, etc. 

b) liquid materials: solutions, suspensions, emulsions . 

c) genetically engineered products 

5 d) food, e.g. for the labeling of genetically altered ingredients or for monitoring of 
foodstuffs during the production process 

e) medical and pharmaceutical products 

f) paper products, documents, money 

g) devices 

10 There is a need to label products in a lot of different fields and due to different 

requirements. Reasons for labeling can be for example: quality assurance during the 
production process, supervising product purity and avoiding contamination, preventing 
counterfeiting and product piracy, detection of certain product components, and product 
labeling with the desired production information. The need to label also exists in 

15 sensitive areas, such as the labeling of food, medical and pharmaceutical products and 
genetically engineered or modified products. Here, it is desirable that information 
about the product is available directly on the product, in particular when it is desirable 
to avoid mix-ups or counterfeits and also to identify different components of a product. 

There are numerous examples for the necessity to label products. For 

20 example, logomers can be used as serial numbers that are mixed into car paint, so that in 
the case of an insured event (e.g. accident or theft) it is possible to identify the car 
owner. By product labeling, it is possible to supervise the quality and purity of 
chemical products and avoid contaminations. A permanent problem is the 
counterfeiting of documents, signatures and money, which necessitates highly 

25 developed labeling methods. 

In particular with animal diseases and epidemics (such as BSE, swine fever, 
salmonellosis, etc.) in mind, there is the problem of quality assurance of foodstuffs and 
the labeling of end products, in order to keep contaminated products away from 
consumption. A further problem is food that contains genetically engineered 

30 ingredients. These ingredients cannot, or only with tremendous difficulty, be detected 



in the end product. This problem is aggravated by the fact that many foodstuffs 
contain ingredients of various origins and the production routes are often unclear. 

Labeling is also a problem in the field of blood products, in which 
contamination may occur by pooling. Also in this field, it is desirable to have 
information about the identity, origin, production date and expiration date available 
directly on the product. This is also true for medical and pharmaceutical products. 

Conventionally, various carbon compounds, polyanilines, liquid crystals or 
other chemical compounds are used to label and mark colors, paints, fuels, etc. The 
used substances and compounds have various deficiencies: For example, some of 
them are toxic, hardly biodegradable, the available information capacity is severely 
limited, they interact chemically with certain substances or are expensive and their 
production is labor-intensive. 

For the use of labeling sensitive and highly sensitive products such as food and drugs, 
the above-mentioned substances and compounds are not suitable at all, due to the 
mentioned deficiencies. 

The requirements of a method for labeling any product include at least: 

• The label should be available directly in or at the product; 

• the label should be easily detectable; 

• the label must be harmless to health. 

In contrast to conventional compounds, the logomers described herein have the 
following advantages: 

• coding of any information, 

• high storage capacity, 

• harmless to health, 

• chemically, biologically, pharmacologically and genetically neutral, 

• easily biodegradable (biomolecules), 

• no environmental harm, 

• easy detection, 

• high compatibility to existing techniques of computer science and molecular 
biology (PCR), 
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• information can be encrypted and are also suitable for authentication and 
encryption, 

• identification of individual ingredients in product mixtures, 

• cheap to produce in large quantities (cloning, amplification with bacteria in 



These properties make logomers much better suited for the mentioned purposes than the 
conventional techniques. Moreover, application fields are opened, that have been 
closed to conventional technologies. 

Because of their properties, nucleic acid based logomers can also be used for 
10 labeling particularly sensitive products such as food and pharmaceutical products. 

The reasons for this are grounded in the chemical nature of nucleic acid based 
logomers on the one hand, and in the coding of information in logomers on the other 
hand: 

1) As the "basic building blocks of life," nucleic acids are elementary building 
15 blocks of all known biological life forms. Thus, they cannot be harmful to 
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fermenters). 
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any known biological organism due to their chemical nature ("biochemical 
compatibility"). However, due to the contained genetic information (of the 
contained program), nucleic acids may very well be potentially harmful when 
they are read by an organism: If they encode for example toxic proteins or if 
(as in the case of viral genes) they can "reprogram" the target organism. 
However, due to the following reasons, the logomers used here cannot contain 
information that is harmful to an organism. 



25 



2) 



Logomers do not contain genetic, i.e. biologically relevant information 
("semantic incompatibility"): Regarding the contained information, logomers 
correspond to letters, numbers or series of letters and numbers in binary coding, 
that are readable to us or a computer, but not for the genetic apparatus of an 
organism. For a biological organism, they are simply "nonsense code". 



3) 



Logomers, and in particular the algomers from which they are assembled, are 
too short even to contain accidentally biologically relevant information. 



30 4) 



In order to preclude any possibility (i.e. the unlikely event that the binary 
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information of the logomers is in some way interpreted by an organism as 
genetically relevant information), the sequences used for the coding are 
provided with stop codons in the case of sensitive application fields, so that 
they can never be read by a biological organism. 
5 5) As elementary components of all biological organisms, nucleic acids are 
ubiquitous in the environment. They are degraded within an extremely short 
time. 

In order to be used as a marker, logomers can be generated according to the 
Steps I - V of the present invention, and amplified in accordance with the methods of 

10 the present invention as described above. Depending on the application, it is possible 
to use different methods and different regenerative processes to amplify the logomers. 
In order to label petrochemical products for example, it is possible to amplify the 
logomers by cloning in bacteria, for example. To avoid unnecessary contamination, 
the resulting bacteria can be lysed. A special cleaning of the logomer DNA is usually 

15 not necessary. 

To manufacture logomers for sensitive and highly sensitive products, however, 
a more elaborate amplification and cleaning method is necessary. Its goal is to obtain 
maximally pure logomers. To avoid biological intermediate steps, the amplification 
can be carried out by PCR, as described in accordance with the present invention. If 
20 amplification by cloning is carried out, then the cleaning of the resulting DNA of 
bacteria is necessary. If the logomers are cloned in bacterial vectors, then a targeted 
degradation of these resistance genes (e.g. by restriction enzymes) is part of any further 
processing. 

Logomers can be connected immediately to the product to be labeled. The 
25 logomers can be mixed into liquids, for example, or applied to solids. In the case of 
genetically engineered products, the logomers can be introduced into the product to be 
labeled by recombination or cloning techniques. 

The labeling of substances or products by logomers enables among others the 
monitoring of production processes, quality control, the identification of individual 
30 components and the quantitative and qualitative detection of contaminations. 




An aspect of the present invention is explained in the following, with an 
example of manufacturing computer-compatible data (bytes/multibytes) of nucleic acids 
and their use in marking various, substances. However, the present invention is not 
limited to this example. 
5 LA coding of bytes and multibytes in form of nucleic acids is defined. 

2. Suitable sequences for the representation of bytes and multibytes are 
constructed. 

3. A library of byte-representing nucleic acids is manufactured in vitro. 

4. Bytes are linked to multibytes. 

10 5. Biochips for the identification of bytes and multibytes are manufactured. 

6. Different (organic and inorganic) materials and objects as well as individual 
genes are labeled (and, if necessary, encrypted) by bytes or multibytes. 

7. The information of labeled materials, objects or individual genes is read out (and, 
if necessary, decrypted). 

15 Explanation of 1: The representation of data with nucleic acids is supposed to ensure 
that the data are computer-compatible. For this purpose, the representation by bytes 
and multibytes is selected, as it is the most universal representation of computer data. 

For this, a grammar G C 32 is defined, with G C 32 = (Z, V, R, S), terminal alphabet 
X := {*m, Om, s m , e m }, variable alphabet V := {A, B, C}, start symbol S and set of rules 

20 R := {S := oA, A s m B, B Ce^C -> x n } wherein n, m € W, n = {0, 1, 255} 
and m ^ 0. The grammar describes the construction of molecules of the form osxe, 
wherein x can take on any byte value between 0 and 255, s describes the byte position, 
and o and e are provided for the concatenation of individual bytes into multibytes (see 
Fig. 29 and Fig. 30). To simplify the manufacturing process, restriction sites are used 

25 as variables. 

In addition, the used nucleic acids must have logical, physical, chemical, and 
biological properties. In particular they should be easy to amplify, easy to read, and 
easy to interact with biological sequences in a predefined manner. 
For this purpose, the data-representing nucleic acids are defined such that 
30 a) they represent bytes 
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b) they can also represent more complex data structures (multibytes) 

c) individual bytes can be linked into multibytes 

d) they can be read with a biochip 

e) they can be used for the manufacture of 1 -byte and multibyte biochips 

f) individual bytes and multibytes can be amplified by PCR 

g) they are clonable 

h) they can be linked with other nucleic acids so as to mark them 

i) they carry as few restriction sites as possible 

a) To let the data-representing nucleic acids represent bytes, exactly 256 
unambiguous sequences must represent all values of a byte. For this, the 
molecular bytes contain an unambiguous sequence x, of which there are 256 
different alleles. These alleles are referred to as x 0 to x 25 5 and enumerate 
exactly all byte values (x 0 = 1, x x = 1, x 25 5 = 255) (see Fig. 25, Fig. 26, Fig. 
27, Fig. 28 and sequence listing). 

b) In order to be able to represent more complex data structures (multibytes), such 
as 32-bit numbers and strings, the byte molecules are provided with position 
information. This information is contained in the sequence s (see Fig. 25, Fig. 
26, Fig. 27, Fig. 28 and sequence listing). Each position is represented by an 
unambiguous sequence si. All byte positions are then enumerated by the 
different alleles of s (sO = position 0, si = position 1 , . . . etc.). 

c) Although for the representation of multibytes, it is in principle sufficient to 
distribute the data to be represented over independent bytes with the 
corresponding position information (as multi-strand multibytes: for example, to 
provide a paint with a serial number of four bytes length, it is possible to mix 
four individual bytes, each having exactly one position information s 0 to s 3 , 
independently into the paint), it is sometimes preferable to link bytes to 
single-stranded multibytes. For this purpose, the bytes carry the sequences o 
and e (see Figs. 25 to 28 and sequence listing), with which they can be linked 
by ligation or overlap assembly and PCR to single-stranded multibytes (see Fig. 
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29 and Fig. 30). These single-stranded multibytes can then be used as 
markers. In the case of parallel overlap assembly, the individual bytes are cut 
out with an appropriate restriction enzyme (here: EcoRV, see Fig. 28). 
Data-representing nucleic acids can be read with sequencers, for example. 
However, in order to read them as easily and fast as possible, it is suitable to 
read them using biochips. For this purpose, biochips are manufactured that 
include all 256 different x-sequences in single-strand form (and thus all byte 
values) in ascending order (see Fig. 32 and Fig. 33). Biochips with this 
configuration are also an object of the present invention. The reading of a 
certain byte value Xi is carried out by denaturing the corresponding data 
molecule to single strands and hybridization with the chip. The hybridized 
sequences are marked, in correspondence to the typical operation principles of 
biochips, for example by fluorescence marking. After the hybridization of 
data molecules with a 1-byte chip, exactly one marked position on the biochip 
is obtained, which thus marks exactly one byte value (see Fig. 36). For 
example, a data molecule that carries the sequence X192 hybridizes exactly at 
position 192 on the chip, and thus it can be identified by its position on the 
chip. 

Since the principle of reading the data molecules with biochips is based on the 
hybridization of complementary sequences, the bytes are configured such that 
they can also be used for the manufacturing of the corresponding biochips. 
For this, the byte molecules carry enzymatic restriction sites, so that from one 
complete byte molecule either the sub-sequence containing x or the 
sub-sequence containing s and x can be cut out. These sub-sequences are then 
denatured and spotted onto a chip carrier. In order to make a 1-byte chip, all 
256 x-sequences are applied in ascending order on the chip. Thus, it is 
possible to identify a sequence with previously unknown x by its hybridization 
position. When only the x sub-sequence is used to make the 1-byte chip, then 
so-called X-chips are obtained (see Fig. 32), and if the sub-sequence including 
s and x is used, then so-called SX-chips are obtained (see Fig. 33). X- and 
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SX-chips can be combined to form multibyte chips. A further condition for 
the manufacture of biochips is the availability of sufficient amounts of 
sequences. To ensure this, the byte molecules can be made directly with 
oligosynthesizers, they can be cloned, or they can be amplified from few 
template molecules by PCR. 

In order to explicitly amplify individual bytes (e.g. from single-stranded 
multibytes), the sub-sequences referred to as o, s and e can serve as priming 
sites (see Fig. 25, Fig. 26, Fig. 27 and Fig. 28). 

In order to be clonable, the byte molecules carry sticky ends that are 
compatible to restriction sites (e.g. Xhol and Xbal, see Fig. 28). 
In order to link bytes and multibytes more easily with the desired nucleic acids, 
it is possible to append further terminators (adaptors) to bytes and multibytes 
by ligation or by overlap assembly and PCR (see Fig. 29 and Fig. 30). These 
adaptors carry restriction sites or recombination sites, so that they can be linked 
to other nucleic acids either by restriction and ligation or by recombination. 
This is in particular advantageous for the marking of nucleic acids, in particular 
genes. 

In the context of linking bytes and multibytes with other nucleic acids, it is 
preferable that in particular multibytes carry only few restriction sites, so that 
the user of the nucleic acid that is provided with the bytes and multibytes can 
work with them as usual and does not have to do without any restriction sites, 
which otherwise destroy the bytes. Therefore, the bytes are configured such 
that after the manufacture of single-stranded multibytes, of all restriction sites 
that are based on sequence palindromes of length 6 only the restriction sites 
Afill, Nhel, and Ncol are contained in the multibytes and therefore cannot be 
used like that, for example, for nucleic acids marked with multibytes. If 
necessary, it is even possible to remove the remaining restriction sites from the 
bytes and multibytes without losing the byte and position information, by 
amplifying multibytes by PCR and mutant primers such that the restriction sites 
are blocked. 




Explanation of 2: The sequences necessary for implementation are synthesized using the 
NFR method described above. The sequences are configured such that all sequences 
are as dissimilar to one another as possible. The x sub-sequences have a GC content of 
50%, and the o, s, and e sequences have a GC content of 66%, multibytes contain three 
5 sequence palindromes of lengths 6, which are recognized one each by Aflll, Nhel and 
Ncol. 

Explanation of 3: The byte molecules can be manufactured directly using an 
oligosynthesizer. In order to manufacture larger amounts of byte molecules, it is 
however advantageous to devise a clone library containing all byte molecules. 

10 Explanation of 4: Due to the included byte position information, bytes can represent 
multibytes without being directly linked to one another (logical linking to 
multi-stranded multibytes). However, it is advantageous for some applications, such 
as the labeling of nucleic acid strands, to link the individual bytes physically together to 
one multibyte (single-stranded multibyte, see Fig. 30 and Fig. 31). To accomplish this, 

15 the individual bytes are linked to one another either by ligation or by overlap assembly 
and PCR. 

Explanation of 5: Biochips for reading bytes and multibytes are manufactured 
with sub-units of the molecules that function as bytes. In principle, there are at least 
two architectures: the X-chip (see Fig. 32), which is manufactured with 

20 x-sub-sequences (see Fig. 25 to Fig. 28), and the SX-chip (see Fig. 33), which is 
manufactured with sx-sub-sequences (see Fig. 25 to Fig. 28). Both chips are 1-byte 
chips and can be combined to multibyte chips. Their operation principle is similar: 
The reading of information from data molecules is carried out by hybridization with the 
chip. The byte value of a data molecule can be determined by its hybridization 

25 position. 

Thus, it is also possible to use multibyte chips as data storages (ROM and 
RAM): The hybridization of the chip with a specific data molecule corresponds to 
assigning a value to a storage cell. The storage cell can then be erased by denaturation 
(e.g. by temperature increase or change of pH). In addition, it is also possible to 
30 provide specific sequences that are hybridized with the chip with optically active 




molecules of different colors, so that specific color patterns can be generated and the 
multibyte chips can be used as a display. Combining sequences of different melting 
temperature with optically active molecules of different colors, a color display in 
accordance with a thermometer can be devised. 

5 The difference between the X-chip and the SX-chip is that the SX-chip can also 

read (and store) byte positions in addition to the byte value. The advantage of the 
X-chip is that very large multibyte arrays can be made of identical X-chips. The 
advantage of the SX-chip is that it can read single-stranded multibyte molecules, 
without having to hybridize the individual bytes separately. 

10 Explanation of 6: With the adaptors L and R (see Fig. 29 and Fig. 30), it is 

possible to connect sequences with bytes and multibytes serving the purpose to connect 
them with nucleic acids. The adaptors contain for example suitable restriction sites or 
recombination sites. For example, a multibyte can be cloned in a plasmid using 
suitable restriction sites, whereby the plasmid receives a label. This label can be for 

15 example a 32-bit serial number (see Fig. 31). Another application example is 
providing a data molecule with recombination sites, wherein the data molecule is 
introduced for example into non-coding regions (e.g. introns) of genes, thus labeling 
these. 

Explanation of 7: The information in substances, objects and molecules that 
20 have been labeled by data molecules can be read out by extracting it from the labeled 
substance, cleaning it if necessary, amplifying it if necessary by PCR and then 
sequencing it or identifying it by hybridization with a byte or multibyte chip, as 
described above. 

The methods described above thus enable the unambiguous labeling of organic 
25 and inorganic substances and objects as well as of nucleic acid constructs and genes. 

The methods described above also enable the manufacturing of data storages 
and optical displays: To store computer data, multibyte arrays are used. The writing 
of a single byte is carried out by hybridizing one 1-byte chip with a nucleic acid (that is 
optically labeled, if necessary), that contains a defined x-sequence exactly once, that 
30 x-sequence corresponding to the value to be written (e.g. X192). The reading of the data 




is performed like the reading out of a biological DNA array, e.g. by scanning. The 
erasing of data is carried out by denaturation and, if necessary, subsequent cleaning. 
The above-described methods enable for example the labeling of single molecules, 
nucleic acids and genetically engineered or modified products: 
5 The logomers described herein can be used to label genetically engineered or 

modified products. 

Conventionally, "classical" biomolecular methods such as PCR, restriction 
digestion, Southern blot hybridization, and sequencing have been used to identify 
genetically engineered and modified products in food. However, with these methods, a 
10 characterization of genetically engineered products and ingredients is very 
labor-intensive and also necessitates separate methods and processes for each product to 
be characterized. 

All in all, there is so far no universal labeling method for genetically 
engineered products. Such a labeling method must fulfill several criteria: 
15 •the label must be absolutely harmless to health, 

• the label should be inseparably linked to the labeled product, 

• the label should be forgery-proof, 

• the label should be able to store enough information, 

• the label should be detectable and readable in small amounts, 

20 The use of nucleic acid based logomers to label genetically engineered products 
satisfies these criteria and furthermore has the following advantages: 

• logomers can store practically any information; 

• individual ingredients can be detected fast and with high sensitivity; 

• a method that is standardized in detail can be applied to read out the information 
25 (readout method of the present invention, for which standardized primers can be 

used); 

• the sequence of the searched genes can be completely unknown; 

• the sequence of the detected genes can be secret, without compromising 
identification or authentication; 

30 • additional security mechanisms can be implemented additionally by encryption. 



In particular, the labeling of genetically engineered or modified products is 
accomplished for example as follows: 

To manufacture markers, any suitable grammar is used, e.g. the one for the random 
number generator, the 1-byte alphabet, or for character strings as described above. The 
characters needed to represent the markers are taken from the characters generated with 
the grammar and added to the product to be labeled. The characters manufactured as 
logomers can be introduced in the following ways in the product to be labeled: 

a) by admixture; herein, the logomers are manufactured, isolated, and amplified, as 
described in the method of the present invention. 

b) by cloning, as described in the method of the present invention; Fig. 6 shows 
for example the labeling of bacterial clones by logomers, which have been 
produced by using the grammar for the random number generator described 
above. 

c) by restriction and ligation; herein the logomers are manufactured, isolated and 
amplified, in accordance with the present invention. The logomers obtained in 
this manner are then connected by restriction and ligation with the nucleic acid 
to be labeled. For this, the terminators (see Fig. 2) and adaptors (see Fig. 30) 
described above can carry the desired restriction sites. 

d) By using recombinative techniques; herein the logomers are manufactured, 
isolated and amplified, in accordance with the present invention. The logomers 
obtained in this manner are then introduced by recombinative techniques into the 
product to be labeled. For this, it is possible to use for example the methods of 
gene targeting by homologous recombination [Capecchi M.R., Altering the 
genome by homologous recombination. Science, 244(4910), 1288-1292, (1989)] 
or e.g. the Cre-IoxP System [Kilby, N.J., Snaith, M.R. & Murray, J.A., 
Site-specific recombinases: tools for genome engineering, Trends Genet., 9, 
413-421, (1993)]. For example, logomers obtained in accordance with the 
present invention can be set before or behind a certain nucleic acid sequence 
(gene), that they are supposed to label. In this case, the logomer is "tacked" as 
a "tag" to a genetically engineered product and can provide information about 
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manufacturer, product group, danger class, expiration date etc. (see Fig. 15), 
The method is in particular suitable for the labeling of single molecules, in 
particular single nucleic acids, and preferably single genes. 

The method is applicable to all organisms (microorganisms, plants, animals) For 
5 example, clones of microorganisms or plants can be labeled. The method is also 
suitable to label food and to label cattle in order to fight the cattle epidemic BSE. 
The above-described methods enable for example the labeling of food: 

Food can be unhealthful or even dangerous because of diseases, epidemics, or 
contamination. Examples are diseases such as BSE, swine fever and salmonellosis. 
10 It is desirable to make it possible to identify product and manufacturer, so that in an 
emergency, certain products or product series can be immediately taken from the market, 
examined, and proper counter-measures can be taken. 

For highly sensitive products such as food, the criteria for labeling genetically 
engineered products have to be applied stringently (see: The above-described methods 
15 enable for example the labeling of single molecules, nucleic acids and genetically 
engineered or modified products). 

All in all, there is also no universal labeling method for genetically engineered 
food products. Conventionally, the identification of genetically engineered or 
modified ingredients in food products is therefore difficult and labor-intensive. For 
20 such labeling, "classical" biomolecular methods such as PCR, restriction digestion, 
Southern blot hybridization, and sequencing are used to characterize the products. 

In the wake of the cattle epidemic BSE, an immunological labeling method for 
the labeling especially of cattle and cattle produce (i.e. beef and milk) has been 
suggested, which is based on an immunization of the animals to be marked with specific 
25 proteins ("Ear mark in blood traces down BSE cattle", Suddeutsche Zeitung Nr. 283, 
08.12.1998, page v2/9). With the production of specific antibodies caused by this 
immunization in the marked target animal, which is supposedly detectable in any kind 
of tissue, the label can be read with an immuno-assay (ELISA). However, such a 
method is on principle limited to higher vertebrates. Moreover, the available amount 
30 of information for this method is highly limited, the information-carrying proteins are 
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not heat-resistant, and the method requires comparatively large amounts of substances 
for immunization and detection reaction. 

As explained in the section The above-described methods enable for example 
the labeling of single molecules, nucleic acids and genetically engineered or modified 
5 products, logomers based on nucleic acids can also be used to for labeling food products. 
The logomers can contain information about the manufacturer, expiration date, and 
serial number, among others. The method for labeling food can be the same as 
described in The above-described methods enable for example the labeling of single 
molecules, nucleic acids and genetically engineered or modified products. Using the 
10 genetic recombination and cloning techniques listed in that section, single organisms 
can be labeled such that a single organism and all its descendants are identifiable. 
The method for labeling by logomers thus enables: 
• monitoring of the target product throughout the entire production and processing 
procedure down to the end user, 
15 • a unified, fast and simple identification of individual ingredients. 

Logomers based on nucleic acids are also suitable as an admixture to label 
beverages. 

The above-described methods enable for example the labeling of medical and 
pharmaceutical products 
20 Because of their non-toxicity, logomers based on nucleic acids are suitable for 

labeling sensitive and highly sensitive products. Besides foodstuffs, medical and 
pharmaceutical products can be labeled. In order to preclude mix-ups, contamination, 
and counterfeits, logomers can be admixed for example to medical and pharmaceutical 
products. 

25 The advantage of labeling with logomers is that the label is connected 

immediately to the labeled product, and products can be labeled individually, for 
example. In this manner, a monitoring of the production process is possible, products 
can be identified even after refilling them several times, and contamination as well as 
the pooling of probes can be detected. For example in the case of blood storage, it is 

30 possible to label production and expiration date, blood type, and manufacturer using 
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logomers. 

The above-described methods enable for example authentication and copy protection 

Logomers can be used for authenticating products, objects and devices. For 
this, the logomers are preferably generated and amplified in accordance with the present 
invention. 

If logomers obtained in this manner are applied or admixed to products, objects, 
and devices, then the information contained in the logomers can be read out in 
accordance with the present invention at any time. If the generation of the logomers 
was performed using encryption techniques, as described above, then the read-out 
necessitates authorized access. 

For example, logomers obtained in accordance with the present invention and 
dissolved in an aqueous solution (preferably of 10 3 to 10 18 molecules/^il) can be applied 
immediately on documents for labeling. As such, they can serve as a certificate of 
authenticity of a document marked that way. To read out the logomers serving as the 
label, a small shred of the paper (1mm 2 is sufficient) is used as a template of a PCR 
reading reaction in accordance with the present invention. An example of this is 
shown in Fig. 16. In the read-out experiment of this figure, a logomer in an aqueous 
solution of ca. 10 9 molecules/)!, was dried for about one hour on 3M Postlt paper and 1 
mm 2 of the Postlt was used as a template of a readout PCR for proof of principle. It is 
possible to use documents of any paper quality, such as 80 g/m 2 standard copying paper. 

The same method is also suitable for labeling money or bills, which is 
advantageous, because: 

• bills that have already been printed can be marked, 

• it is possible to assign serial numbers, 

• encryption can be used. 

In order to test the authenticity of documents, if possible in real-time, it is possible 
to use one of the primers used for readout in the PCR as a hybridization probe for a 
subsequent coloring detection reaction (e.g. with an intercalating colorant, such .as 
ethidium bromide). 

Another example is the authentication of liquid solutions, suspensions and 
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emulsions with logomers. Here, it is possible to mark inks individually with logomers, 
so that the authenticity of signatures can be tested. 

Another example is the labeling of automobiles by mixing logomers into the car 
paint. For this, logomers are added as an admixture to the car paint (or another 
5 component of the car). The added logomers can contain information about the serial 
number, manufacturer, car type, or the like. Based on this information, a vehicle can 
be identified in the case of an accident, theft, or illegal disposal. An advantage of this 
method is that already traces of the car paint are sufficient to identify the vehicle, which 
is particularly desirable for accidents. Another advantage is that it becomes very 
10 labor-intensive to fake the identity of a vehicle. To do this, first, all labeled 
components would have to be removed, and second, a fake identity would have to be 
established. This is already very difficult because of the complicated mechanical 
procedure, and moreover, the use of cryptographic methods can make the copying or 
mimicking of serial numbers as difficult as desired. To read out the information 
15 contained as logomers, the logomers are isolated from the car paint, and can then be 
read out by the method for the reading of information contained in logomers described 
above, preferably by PCR. For this, a sample of the paint is obtained, dissolved in a 
suitable solvent (such as turpentine) and the same volume of water is added. The 
following centrifugation separates hydrophilic and hydrophobic parts. Then, taking 
20 the aqueous phase (containing the hydrophilic nucleic acid based logomers), the 
logomers contained in the aqueous phase are precipitated by the regular methods (e.g. as 
described in [J. Sambrook, E.F. Fritsch, T Maniatis, Molecular Cloning, A Laboratory 
Manual, (1989)]. After the precipitation, the resulting logomers can be read out with 
the methods according to the present invention, such as by readout PCR. 
25 Thus, other objects of the present invention are the use of information-carrying 

polymers, in particular information-carrying polymers obtained in accordance with the 
present invention, for the purpose of quality assurance, copy protection, labeling of food, 
labeling of genetically engineered, chemical, medical, and pharmaceutical products, the 
labeling of organisms, the labeling of documents, the labeling of money, the labeling of 
30 objects and machinery, the labeling of solutions, suspensions and emulsions, as well as 
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the authentication of persons and objects. 

The above-described methods enable for example the production of a molecular weight 
standard 

One of the typical methods of biomolecular analysis is the electrophoretic 
5 longitudinal separation of DNA and RNA fragments. Usually, in order to determine 
the length of such fragments in electrophoresis, a molecular weight standard ("ladder") 
is used, that contains fragments of known length and that is used for the comparison 
with fragments of the samples to be analyzed. 

So far, there are molecular weight standards that contain either irregular ladders 
10 (e.g. Boehringer V, Boehringer Mannheim, Catalogue No.: 821705) or regular ladders 
with regular fragment-length distances (e.g. 50bp ladder Gibco BRL, Life Technologies, 
Catalogue No.: 10416-014). None of the conventional ladders has length distances 
that are shorter than lObp (e.g. lObp ladder Gibco BRL, life Technologies, Catalogue 
No.: 10821-015). 

15 With the methods described herein, it is possible to manufacture specifically 

DNA fragments of different length and predefined length intervals. These fragments 
can be used as molecular weight standards, and the method makes it possible to realize, 
in principle, any length and length distances. In particular, length distances up down to 
lbp interval, i.e. molecular weight standards of the highest resolution, are possible. 

20 The method is based on using logomers for the template of a readout PCR, 

whose result is a mixture of DNA fragments of predefined length and predefined length 
distances. It is based on the inventive method for generating and reading logomers. 

DNA fragments of various lengths in very short, regular intervals can be 
obtained when a unary polymer functions as a template for a readout PCR with nested 

25 primers. The 5' primer primes at the start sequence of the polymer. The anti-sense 3' 
primer is attained by a group of primers, that starts nested in the elongator (shown in Fig. 
17 as three levels of primers shifted with respect to one another). Since the polymer is 
made of repetitions of only one elongator molecule, n*m bands are obtained for n 
repetitions and m nested 3' primers. This method also works laterally reversed with a 

30 primer priming in the end terminator and corresponding anti-sense 5' primers. The 
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resulting bands can be used as molecular weight standards, e.g. in electrophoresis. 

For high resolution molecular weight standards, the use of unary logomers and 
nested primers is most preferable. However, the method is not limited to this, because 
in principle, all grammars described in Step I of the present invention can be used. 
5 Unary logomers of any length can be produced with a grammar G = (£, V, R, 

S) with a terminal alphabet £ := {0, s, e}, a set of variables V := {A}, a start symbol S 
and a set of rules R 
R := 
{ 

10 S:=sA 
a a A 
A ^ e 

} 

If it is necessary to produce unary logomers of a predefined length, then the set 
15 of variables and thus the number of elongators must be chosen to be larger (such as for 
example in the grammar for the implementation of a 1-byte alphabet described above). 
Depending on the desired length of the DNA fragments to be generated, it is also 
possible to vary the length of the algomers. 

The generated logomers can be isolated and amplified with the methods of the 
20 present invention. 

The generation of DNA fragments of predefined length and predefined length 
intervals is carried out preferably by PCR, similar to the inventive method for reading 
out logomers. For high resolution molecular weight standards, at least one primer, but 
mostly a number of primers is used, which are shifted against one another. The 
25 number of primers per elongator depends on the desired length intervals. If it is 
desired to use algomers of a length of 30bp and if the lengths of the generated DNA 
fragments should have a difference of lObp, then three primers, that are shifted exactly 
lObp with respect to one another, are used. 

For the synthesis of sequences for producing algomers and logomers, the 
30 requirements described for Step II of the method in accordance with the present 
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invention have to be fulfilled. In particular, because the partial sequences serving as 
the template have the same GC content, the NFR method ensures the use of nested 
primers and the balance of the AT/GC ratio of the generated DNA fragments. This last 
aspect is crucial, regarding the fact that the running behavior of the DNA fragments in 

5 the electrophoresis depends not only on length, but also on the configuration of the 
fragments. This is particularly true for high resolution molecular weight standards, in 
which there are only small length intervals. 

A further object of the present invention is thus the use of information-carrying 
polymers, in particular information-carrying polymers obtained in accordance with the 

10 present invention, for the manufacture of molecular weight standards. 

The above-described methods enable for example the manufacturing of a polymeric 
data storage 

The logomers explained above can be used as RAM (read and write) or ROM 
(read only) data storages with high capacity and low energy consumption In order to 
15 ensure the highest possible information density and to make the handling of the storage 
as easy as possible, the logomers are bonded to a solid carrier. This makes sure that 
the logomers can be addressed individually, i.e. written, read, and erased individually 
(see Fig. 9). 

The necessary write and read operations are carried out on the basis of the 
20 enzymatic reactions described for Step V of the present invention and in the section 
Reading of the information contained in the logomers, which are controlled by changing 
the temperature. The temperature changes can be carried out by laser or by heating 
and cooling the solid carrier, e.g. in a thermal cycler. 

To write, a modification of the above-described method of symbol 
25 polymerization is used. The polymerization is carried out on a solid carrier, and the 
algomers are concatenated with one another by repeated restriction-ligation cycles. 

In order to manufacture single, addressable storage cells, "anchor molecules" 
are irreversibly connected (by UV irradiation or in a furnace) to the carrier in certain 
intervals. Thus, logomers that can be removed in an erasure process can be written 
30 reversibly onto the carrier provided with such anchor molecules. 
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For the write process, first, specific algomers ("starter algomers") are bonded 
to the anchor molecules by hybridization. These starter algomers are the point of 
origin of a symbol polymerization, in which further algomers are concatenated with one 
another. Different from Step V of the method of the present invention, the 

5 concatenation of algomers is carried out by repeated cycles of restriction and ligations. 
In each cycle, the last algomer of a polymerizing chain is cut with a restriction enzyme, 
so that a single-stranded overhang sequence (elongation point) is created, which can 
then be bonded by ligation to the subsequent algomer. 

For this write process, the algomers must be provided with a specific design: 

10 a) Each algomer to be written has a single-stranded overhang sequence, with 
which it can bond to the elongation point of a logomer. 

b) An algomer that has only a single-stranded overhang sequence with which it 
can bond to the elongation point, but not a restriction cut site, by which it can be 
recognized by restriction enzymes and thus serve as an elongation point, is referred to as 

15 "end algomer" 

c) Every algomer to be written that is not an end algomer has a double-stranded 
end containing the recognition sequence for the selected restriction enzyme. At a 
restriction, the enzyme splits the algomer, so that a single-stranded overhang sequence 
is created, which then can serve as an elongation point. 

20 d) Every finished logomer has exactly one start algomer and one end algomer. 
Start and end algomers are, in accordance with the definition given above, terminators. 

e) The overhang sequences of the algomers are compatible to one corresponding 
selected restriction enzyme, so that an algomer to be concatenated can bond to the 
elongation point of a logomer and the elongation point created subsequently by 

25 restriction is identical to its predecessor. 

f) The sequence following the overhang sequence in the algomer is selected such 
that a concatenated algomer cannot be split off again by a subsequent restriction 
process. 

Suitable solid carriers include membranes, such as Gene Screen Plus (DuPont, 
30 Biotechnology Systems, Catalogue No.: NEF-986 or NEF-987), Hybond-N (Amersham 
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Life Sciences, Catalogue No.: RPN 82N or RPN 137N), silicon or silicate surfaces or 
glass. The anchor molecules are covalently bonded to the solid carrier (by UV 
irradiation or heat). 

In order to make the molecules individually addressable, they have to be 
5 arranged on the carrier in regular intervals. 

Suitable enzymes include commercially available enzymes, preferably 
enzymes that are not deactivated at 65°C (e.g. HindlU). 

The writing operations are based on the fact that the enzymes used for writing 
have different temperature optimums. For example, the temperature optimum of the 
10 ligase used for the concatenation of symbols is 16°C, whereas the necessitated 
restriction enzyme functions only at 37°C. Thus, the timed writing of symbols in 
cycles of restriction and ligation is possible. 

Since all read and write operations are carried out using enzymes, it is 
necessary to be able to adjust the temperature of the carrier. For this, all storage cells 
15 can be subjected simultaneously to the same operation by using a thermal cycler and 
material that can be manipulated thermally. Alternatively, storage cells can be 
accessed individually and independently from one another, if a correspondingly small 
write/read head is used. This write/read head must serve as a pipetting device for the 
necessitated molecules (enzymes, algomers) on the one hand, but must also generate the 
20 temperature needed for a certain manipulation (e.g. with a laser). 

In the erasure operation, a logomer is detached from its anchor molecule by 
denaturation. The anchor molecule is then available again for a write operation. For 
a read operation, a logomer is detached from an anchor molecule by denaturation. 
Then, it can be read with the methods of the present invention. 
25 The write, erase, and read operations described above are carried out by 

enzymes at various temperatures. For this, using a material that can be manipulated 
thermally, all storage cells can be subjected to the same operation. Alternatively, 
storage cells can be accessed individually and independently from one another, if a 
correspondingly small write/read head is used. This write/read head must serve as a 
30 pipetting device on the one hand, but must also generate the temperature needed for a 
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certain manipulation. 

The advantage of the polymer storage explained above is in its higher storage 
density (DNA molecules have a size of around 10" I0 m) as compared to conventional 
materials, and its much lower energy consumption (regarding the energy needed for 
5 enzymatic reactions, see [L.M. Adleman, Molecular Computation of Solutions to 
Combinatorial Problems, Science, 266, 1021-1024, (1994)]). Moreover, the 
environmental compatibility is improved considerably, because the polymers described 
herein are completely non-toxic. 

Compared to the aqueous solutions used conventionally in DNA computing, 
10 the approach described above has the advantage of higher storage densities and 
addressability of individual polymers. 

Thus, another object of the present invention is to provide a polymeric data 
storage containing information-carrying polymers according to the present invention, 
methods for linking molecules in accordance with the present invention, and methods 
15 for reading or methods for isolating and amplifying in accordance with the present 
invention. 

The above-described methods enable for example the manufacturing of a DNA 
computer 

Using the above-described information representation by algomers and 
20 logomers, it is possible to configure a DNA computer. The DNA computer includes 
(see Fig. 10): 

a) oligonucleotide synthesizers 

b) thermal cyclers 

c) pipetting device 

25 d) device for isolating polymers 

e) device for separating nucleic acids, such as an electrophoresis system 

f) scanner 

g) control computer (work station, e.g. PC) 

The DNA computer is configured as a hybrid system, in which simple, 
30 calculation-intensive algorithms or the storage of data are executed with DNA and the 




necessary devices (devices a) to e)), and a conventional computer (preferably a work 
station, e.g. a PC) which is used as host and control computer. 

The DNA computer contains at least one thermal cycler serving as an 
"information reactor," which is controlled with the control computer (e.g. MJ Research 

5 PTC-100 Eppendorf Mastercycler). In the tubes or microtiter plates of the thermal 
cycler, there are molecule mixtures of nucleic acids, enzymes and, further chemical 
substances (such as buffers and nucleotides) that are needed for the algomer assembly 
(Step IV of the method of the present invention) and the symbol polymerization (Step V 
of the method of the present invention). The DNA computer is programmed by the 

10 implementation of algorithms as grammars (Step I of the method of the present 
invention). The monomer sequences needed for the grammars are produced with the 
NFR method and using an oligonucleotide synthesizer (e.g. ABI 392, ABI 398, ABI 
3948 by Perkin-Elmer Applied Biosystems) (Steps II and m of the method of the 
present invention). These monomer sequences are introduced as input into the thermal 

15 cycler, in which the algomer assembly (Step IV of the method of the present invention) 
is carried out. 

The resulting algomers are introduced in one or more reaction chambers of the 
thermal cyclers, where they are linked to logomers (symbol polymerization, Step V of 
the method of the present invention). The resulting logomers are isolated and 

20 amplified in a separate device. This device can operate in accordance with the 
principles that are described for the method of the present invention. After isolation 
and amplification, the logomers can be read. For this, they are read out in accordance 
with the present invention in a thermal cycler. The nucleic acid fragments resulting 
from the readout process can be read by gel electrophoresis, as described above. Here, 

25 the gel electrophoresis device serves as the output device of the DNA computer, and the 
resulting band patterns are read into the control computer by a scanner. 

The control computer now undertakes further calculations and controls based 
on the obtained results, and, if necessary, causes further molecular reactions. For 
example, based on the obtained results, the control computer can cause the new 

30 implementation of grammars (manufacture of oligomers with the NFR processes). In 
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this manner, the information obtained by molecular processes as logomers can serve not 
only as passive data, but also as instructions (and addresses). Thus, a self-regulating 
process of different algorithms, which is necessary for a universal data processing, is 
possible. 

5 The DNA computer processes information as oligomers and polymers. 

Cloning vectors (plasmids) that are located in a liquid solution or on a solid, addressable 
carrier serve as storage cells of the data. 

The DNA computer as described above is programmable by grammars and can be 
controlled by a conventional computer, which functions as a control computer and host 
10 system. 

Since the DNA computer can carry out several operations in molar orders of 
magnitude (for example the symbol polymerization, as described in Step V of the 
method of the present invention, in 10 12 - 10 20 elementary operations per second), a 
possible application is the calculation of simple, calculation-intensive algorithms. 

15 However, the computer can also be used for other applications, such as the generation of 
certain, regular DNA structures, which are of interest in nano-technology, for example. 

A further object of the present invention is to provide a DNA computer 
including information-carrying polymers in accordance with the present invention. A 
further object of the present invention is to provide a DNA computer, in which a reading 

20 method in accordance with the present invention and/or an isolation and amplification 
method in accordance with the present invention are used. 

The above-described methods enable for example the manufacturing of smallest 
molecular structures with logomers 

It is possible to manufacture the smallest molecular structures (nanostructures) 

25 with the logomers described herein. Components of such structures are algomers, 
which can be assembled to ordered patterns of logomers. For example, various 
algomers can be doped with different impurity molecules (ligands), in order to obtain 
ordered, higher molecular patterns of the corresponding ligands. Herein, the logomers 
function as the "skeleton" of molecular components of ligands. 

30 For example, a binary logomer can contain an alternating pattern of conducting 
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and non-conducting ligands. By arranging many logomers on a solid carrier, it is 
possible to produce conductors in the nanometer order. 

But it is also possible to use, for example, biomolecules, antibodies, optically active 
molecules as ligands. 

5 Logomers can be used as an "intelligent" glue (see Fig. 18). For this, the 

hybridization of complementary nucleotides is taken advantage of. When logomers 
are applied to surfaces to be glued together, and the logomers are bonded for example 
covalently to the surfaces to be bonded, since only surfaces with complementary nucleic 
acids form hydrogen bonds then otherwise identical surfaces can adhere to one another 

10 if they have complementary sequences. The "intelligence" of the glue is thus 
grounded in the highly selective adhesion effect, which depends on the sequences used. 
It is also possible to precisely adjust the adhesion strength for sequences to be glued 
together: First, it is possible to adjust the adhesion strength of surfaces by adjusting 
the density of logomers on them, and second, it is possible to vary the melting point of 

15 the logomers via the AT/GC ratio, so that surfaces glued together loose their adhesion at 
different temperatures. Due to the fact that harmless, easily degradable biomolecules 
are used for the glue, it is also possible to use such glues for medical applications (e.g. 
in surgery, or microsurgery). Another application is the use of the above-described 
method in high-precision applications and authentication applications. 

20 If sequences for the binding of ligands are added to the algomers used to 

produce the logomers, then it is possible to attain a stronger adhesive effect. Such 
ligands can be, in the case of DNA, DNA-binding proteins, such as specific antibodies 
directed at the DNA-sequences. These can then be further bonded among one another 
with a further ligand for example (see Fig. 19). A simple, non-programmable adhesive 

25 effect can also be attained without logomers, if the proteins used (e.g. antibodies) can 
bond to the surface to be glued (such as antibodies, if the surface to be bonded is their 
antigen) (See Fig. 20). Simpler glues are possible by genetically producing proteins 
bonding specifically to certain surfaces. 

An alternative method is to connect the surface to be glued together not by 

30 weak interaction but by covalent binding. Different from the method described above, 




the adhesion force is then adjusted with the density of algomers. For this, the surface 
to be glued together are provided with algomers that carry sticky ends that are 
complementary to one another. These can then hybridize and be covalently bonded 
with each another by ligation. 

5 With logomers, it is possible to produce surfaces of very different structure and 

different physical properties. What these applications have in common, is that with the 
logomers, almost any desired pattern can be formed in the nanometer order, so that they 
can serve as the skeleton of the smallest molecular components. 

A further object of the present invention is thus the use of information-carrying 

10 polymers, in particular of information-carrying polymers obtained in accordance with 
the present invention, to produce or process smallest molecular structures, or as 
molecular-scale adhesive. 

The above-described methods enable for example the manufacturing of 
nanotechnological construction systems. 

15 With the NFR method and the "parallel extension" method described above, it 

is possible, for the first time, to translate larger grammars into molecules. These 
grammars are not limited to regular grammars. Due to the reusability of sequences, it 
is for example also possible to program ^-molecules [Matthias Scheffler, Axel 
Dorenbeck, Stefan Jordan, Michael Wustefeld, Giinter von Kiedrowski, Self-Assembly 

20 of Trisoligonucleotidyls: The Case for Nano-Acetylene and Nano-Cyclobutadiene, 
Angewandte Chemie Int. Ed, 38(22), 3312-3315, (1999)] and DX-molecules [Eric, 
Winfree, Furong Liu, Lisa A. Wenzler & Nadrian C. Seeman, Design and self-assembly 
of two-dimensional DNA crystals, Nature, 394, 539-544, (1998)]. 

A further object of the present invention is therefore the use of the 

25 NFR-method, the "parallel extension" method and the translation of grammars in 
molecules to produce components of nanotechnological construction systems. 
The above-described methods enable for example the controlled, programmable 
production of biologically active nucleic acids. 

For this, biologically active sequences such as restriction sites, recombination 

30 sites, centromers, promoters, exons, genes, etc. are used as terminals instead of artificial 
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sequences to represent symbols. The biologically active sequences can then be 
assembled in a controlled, programmable manner to biologically active constructs, such 
as genes or artificial chromosomes. 

A further object of the present invention is thus the use of information-carrying 
5 polymers, in particular information-carrying polymers in accordance with the present 
invention, for the controlled production of biologically active nucleic acids. 
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WE CLAIM: 

1 . A method for producing information-carrying polymers, comprising: 

L defining a regular grammar G = (X, V, R, S) with a finite terminal 

5 alphabet X, a finite set of variables V, a finite set of rules R, and a start 

symbol S; 

n. the NFR method (Niehaus-Feldkamp-Rauhe method) for producing 

monomer sequences; 
m. implementing, with the NFR method, a grammar defined in Step I, by 
10 producing with the NFR method monomer sequences that 

unambiguously represent the set of rules R of a grammar G; 
IV. assembling, from the monomer sequences manufactured in Step HI, 

for each rule of the set of rules R of G an oligomer representing that 

rule; 

15 v. linking the oligomers assembled in Step IV to information-carrying 

polymers. 

2. Method according to claim 1 , characterized in that the terminal alphabet £ of a 
grammar G contains terminals 0 and 1, n start terminals s (so, si, s n ) and m end 

20 terminals e (eo, ei, . . e m ), wherein n and m are integers larger than or equal to 0. 

3. Method according to claim 1 or 2, characterized in that the monomer sequence 
constructed in Step m includes nucleotides, in particular ribonucleotides, most 
preferably deoxyribonucleotides. 



25 



4. Method according to claim 3, characterized in that the monomer sequence 
constructed in Step m includes protein recognition sequences (such as restriction cut 
sites, protein binding sites or stop codons). 



30 5. Method according to any of the preceding claims, characterized in that the 
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synthesis of the monomer sequences in Step H and m is carried out in vitro, preferably 
with an oligonucleotide synthesizer. 

6. Method for isolating and amplifying information-carrying polymers that have 
been obtained in accordance with any of the preceding claims, characterized in that the 
information-carrying polymers obtained in Step V are ligated into cloning vectors, 
competent cells are transformed with these vectors, and the successfully transformed 
bacteria are selected according to selection markers. 

7. Method for reading information from information-carrying polymers that have 
been obtained or isolated and amplified in accordance with any of the preceding claims, 

characterized in that 

a) one pair of anti-sense primers each is mixed into at least n-1 solutions 
containing the information-carrying polymer, wherein n is the number 
of oligomers contained as elongators in the polymer; 

b) carrying out at least n-1 PCR approaches, wherein n is the number of 
oligomers contained as elongators in the polymer, and one primer of 
each pair primes in the terminator opposite to the elongator and the 
other primer primes in the elongator itself; 

c) the polymer fragments obtained by PCR are separated by length using 
electrophoresis; and 

d) the pattern obtained by electrophoresis is read out optically. 

8. Method according to claim 7, characterized in that the reading out in Step d) is 
performed automatically with a scanner or a sequencing machine. 

9. Information-carrying polymer obtained in accordance with one of the claims 1 
to 6. 

10. Random number generator comprising information-carrying polymers, in 




particular polymers in accordance with claim 9. 

1 1 . Polymeric data storage comprising polymers in accordance with claim 9. 
5 12. DNA-computer comprising polymers in accordance with claim 9. 

13. Biochip comprising polymers in accordance with claim 9. 

14. Use of information-carrying polymers in accordance with claim 9 to 
10 manufacture molecular weight standards. 

15. Use of information-carrying polymers in accordance with claim 9 to represent 
data structures. 

15 16. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9 as markers or signatures. 

17. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of quality assurance. 

20 

18. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of forgery protection. 

19. Use of information-carrying polymers, in particular of polymers in accordance 
25 with claim 9, for the purpose of labeling genetically engineered products. 

20. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of labeling food. 



30 21. 



Use of information-carrying polymers, in particular of polymers in accordance 




with claim 9, for the purpose labeling organisms. 

22. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of labeling chemical products. 

5 

23. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of labeling medical and pharmaceutical products. 

24. Use of information-carrying polymers, in particular of polymers in accordance 
10 with claim 9, for the purpose of labeling documents. 

25. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of labeling money. 

15 26. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of labeling objects and machinery. 



27. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, for the purpose of labeling liquids, solutions, suspensions or emulsions. 

20 

28. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, to encrypt information. 

29. Use of information-carrying polymers, in particular of polymers in accordance 
25 with claim 9, for the purpose of authenticating persons and objects. 

30. Use of information-carrying polymers, in particular of polymers in accordance 
with claim 9, as molecular-scale adhesives. 



30 31. Use of information-carrying polymers in accordance with claim 9 to 




manufacture or process smallest molecular structures. 



32. Use .of information-carrying polymers in accordance with claim 9 for quality 
control of synthetically produced oligonucleotides. 

5 

33. Use of information-carrying polymers in accordance with claim 9 to 
manufacture biochips. 

34. 1- to n-byte nucleic acids, in particular 1- to n-byte nucleic acids obtained by 
10 any of the claims 1 to 6. 

35. 1- to n-byte biochips, in particular 1- to n-byte biochips obtained by any of the 
claims 1 to 6. 



15 36. Use of biochips, in particular of biochips in accordance with any of the claims 
1 3, 33 and 35, as a data storage. 

37. Use of biochips, in particular of biochips in accordance with any of the claims 
13, 33 and 35, to manufacture optical display devices or display screens. 

20 

38. Use of information-carrying polymers, in particular polymers in accordance 
with claim 9, to label individual molecules, in particular nucleic acids, most preferably 
of genes. 

25 39. Use of nucleic acids to encrypt information. 



40. Use of nucleic acids to encrypt information, characterized in that for decryption, 
short nucleic acid sequences (primers) are used as the key. 



30 



41. Use of information-carrying polymers, in particular polymers in accordance 




with claim 9, to encrypt information, characterized in that the information-carrying 
polymer is concealed in a multitude of other polymers. 

42. Use of polymers for the purpose of labeling, characterized in that the polymers 
5 are encrypted. 

43. Manufacturing biologically active molecules with controlled, programmable 
linking of sequences. 

10 44. Manufacturing biologically active molecules by using algomers. 

45. Method for reading information from information-carrying polymers obtained 
or isolated and amplified in accordance with any of the claims 1 to 6, characterized in 
that biochips are used for the reading. 

15 

46. Use of the NFR method to manufacture biochips. 



47. Use of the NFR method to manufacture nano-technological components or 
components of nano-technological modular systems. 

20 




ABSTRACT 

The invention relates to methods for producing information-carrying polymers, 
information-carrying polymers obtained using these methods, and to methods for 
isolating, amplifying and selecting such information-carrying polymers. The invention 
5 also relates to polymeric data memories and DNA computer which contain the 
information-carrying polymers, and to the use of information-carrying polymers for 
producing molecular weight standards, as markers or signatures, for encrypting 
information, as molecular-scale adhesives, or for producing or processing smallest 
molecular structures. 
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R = {..., S-*OA, S->OB, S->OC, S-H)D, S-*OE, 

A-*1F, B->2G, C->3H, D-*4I, E-*5J, ...} 
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deqIaration and power of attorney 



As a below named inventor, I hereby declare that: 

My residence, post office address, and citizenship are as stated below next to my name. 

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if 
plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention 
entitled* 

INFORMATION-CARRYING AND INFORMATION-PROCESSING POLYMERS 

the specification of which (check one) 
I | is attached hereto 

was filed on September 28, 2001 as Application Serial No. 09/937,727 and was amended on September 28, 
2001 and November 8, 2001 (if applicable). 

I hereby state that I have reviewed and understand the contents of the above-identified specification, including the claims, as 
amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is known by me to be material to patentability as defined m Title 37, 
Code of Federal Regulations § 1.56. 

I hereby claim foreign priority benefits under Title 35, United States Code, § 1 19 of any foreign application(s) for patent or 
inventor's certificate listed below and have also identified below any foreign application for patent or inventor's certificate 
having a filing date before that of the application on which priority is claimed: 



PRIOR FOREIGN APPLICATION(S) 



NUMBER 


COUNTRY 


MONTH/DAY/YEAR FILED 


PRIORITY CLAIMED | 


199 14 808.2 


Germany 


03/31/1999 


Yes 


PCT/EPOO/02893 


PCT 


03/31/2000 


Yes 











I hereby claim the benefit under Title 35, United States Code, §120 of any United States application(s) listed below and 
insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States application m 
the manner provided by the first paragraph Title 35, United States Code, §1 12, 1 acknowledge the duty to disclose 
information which is known by me to be material to patentability as defined in Title 37, Code of Federal Regulations § 1 .56 
which occurred between the filing date of the prior application and the national or PCT international filing date of this 
application: 



APPLICATION SERIAL 
NUMBER 


FILING DATE 


STATUS: PATENTED, PENDING, 
ABANDONED 





















I hereby appoint as my attorneys, with full powers of substitution and revocation, to prosecute this application and transact 
all business in the Patent and Trademark Office connected therewith: David L. TarnofF, Reg. No. 32,383 ; Yoshio Miyagawa, 
Reg. No. 43,393 ; Patrick A. Hilsmier Reg. No. 46.034 ; James W. Judge, Reg. No. 42.701, Todd Guise Reg. No.46,748. 



Send all correspondence to: SHINJYU GLOBAL IP COUNSELORS , LLP, 1233 Twentie th Street. NW, Suite 700, 
Washington, DC 20036 . Address all telephone communications to David L. Tarnoff at (202)-293-0444. 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information 
and belief are believed to be true; and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the 
United States Code and that such willful false statements may jeopardize the validity of the application or any patent issued 
thereon. 



Full Name of First or Sole Inventor: 
Hilmar RAUHE 


SieBa^e^^FitsLor Sole Inwentor 


Date: 


Residence Address: S 
CTiitenberestrasse 29. D-45128 Essen, Germany 


Country of Citizenship: 
Germany 


Post Office Address: 

Same as above 1 



Signatures should confirm to names as typewritten. £3 Additional inventors on the attached Page 2. 
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pFull Name of Second Inventor: 
TIdo FELDKAMP 


Signature of Second Inventor 


Date: 


I Residence Address: ^ 

II TstenHnrfer Strasse 141. D-47057 Duisbure. Germany XD^x 


Country of Citizenship: 
Germany 


Post Office Address: 

II Same as above . — ^^=^^= 






FUli Name of Second Inventor: 
Wolfgang BANZHAF 


Signature of Second Inventor 


Date: 


Residence Address. 

Siftaenstrasse 104. D-44359 Dortmund, Germany lO^x 


Country of Citizenship: 
Germany 


Post Office Address: 
Same as above 






Full Name of Third Inventor: 1 Signature of Third Inventor 
Jonathan C. HOWARD 1 ' -XYVouJ ^ 


Date: II 


Residence Address: 

Heinetitrasse 19. D-50931 KoTn, Germany -TD&p^ 


Country of Citizenship: 
Germany 


Post Office Address: 

|| Same as above . — 1 
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Sequence Listing 



Sequence 

00 OOU tcgaggatatccgggcttcgggtgaccgggatcttaaga 
OOL agcttcttaagatcccggtcacccgaagcccggatatcc 

01 01U tcgaggatatcccttggcggtgaggcggcgatcttaaga 
OIL agcttcttaagatcgccgcctcaccgccaagggatatcc 

02 02U tcgaggatatctgtgggctacggggccgcgatcttaaga 
02L agcttcttaagatcgcggccccgtagcccacagatatcc 

03 03U tcgaggatatctaggtggctgcgcggggcgatcttaaga 
03L agcttcttaagatcgccccgcgcagccacctagatatcc 

50 SOU ttaagcgaagtggcctggcgggcgctagcg 
SOL aattcgctagcgcccgccaggccacttcgc 

51 S1U ttaagaacccgagccccggacgggctagcg 
S1L aattcgctagcccgtccggggctcgggttc 

52 S20 ttaagcgcaccaggcagagcgcggctagcg 
S2L aattcgctagccgcgctctgcctggtgcgc 

53 S3U ttaagtcctcgtggtcgggcggcgctagcg 
S3L aattcgctagcgccgcccgaccacgaggac 

EO EOU gatccccatggaatcccttggcggtgaggcggcgatatct 

EOL ctagagatatcgccgcctcaccgccaagggattccatggg 

El E1U gatccccatggaatctgtgggctacggggccgcgatatct 

E1L ctagagatatcgcggccccgtagcccacagattccatggg 

E2 E2U gatccccatggaatctaggtggctgcgcggggcgatatct 

E2L ctagagatatcgccccgcgcagccacctagattccatggg 

E3 E3U gatccccatggaatccgcaagccccaaggccccgatatct 

E3L ctagagatatcggggccttggggcttgcggattccatggg 

XO XOU ctagcgtatgccctgtagtttggagc 

XOL catggctccaaactacagggcatacg 

XI XltJ ctagcgtaaagccaactgtaccctgc 

X1L catggcagggtacagttggctttacg 

X2 X2U ctagcgagactaaggttgttggtcgc 

X2L catggcgaccaacaaccttagtctcg 

X3 X3U ctagcgttccttcaagtcactcgtcc ' 

X3L catgggacgagtgacttgaaggaacg 

X4 X4U ctagccctgcccacagtagaataagc 

X4L catggcttattctactgtgggcaggg 

X5 X5U ctagcgaatcaagactcgtgctaccc 

X5L catggggtagcacgagtcttgattcg 

X6 X6U ctagcgatgacagttagacgcttccc 

X6L catggggaagcgtctaactgt'catcg 

X7 X7U ctagcctcctgaacctaagtcatcgc 

X7L catggcgatgacttaggttcaggagg 

X8 X8U ctagcccaaagttacagacctcagcc 

X8L catgggctgaggtctgtaactttggg 
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X9 X9U 

X9L 
X10 XlOU 

XIOL 
Xll X11U 

X11L. 
X12 X12U 

X12L 
X13 X13U 

X13L 
X14 X14U 

X14L 
X15 X15U 

X15L 
X16 X16U 

X16L 
X17 X17U 

X17L 

xi8 xiau 

X18L 
X19 X19U 

X19L 
X20 X20U 

X20L 
X21 X21U 

X21L 
X22 X22U 

X22L 
X23 X23U 

X23L 
X24 X24U 

X24L 
X25 X25D 

X25L 
X2 6 X26U 

X2 6L 
X27 X27U 

X27L 
X28 X28U 

X28L 
X29 X29U 

X2 9L 
X30 X30U 

X30L 



ctagcgacctatgcaatctactcgcc 
catgggcgagtagattgcataggtcg 
ctagccagtagtaatcttccgcaggc 
catggcctgcggaagattactactgg 
ctagcctgttggtgggtactcctaac 
catggttaggagtacccaccaacagg 
ctagcaacgaggtggtacgacatagc 
catggctatgtcgtaccacctcgttg 
ctagccgcagtcattatagagagccc 
catggggctctctataatgactgcgg 
ctagcgtacttgtccctccagcatac 
catggtatgctggagggacaagtacg 
ctagcatcctgtgtgcctaagtacgc 
catggcgtacttaggcacacaggatg 
ctagcgaggcacttgacgtatgactc 
catggagtcatacgtcaagtgcctcg 
ctagcgggatacctttctaagcgacc 
catgggtcgcttagaaaggtatcccg 
ctagcggataggtcgcagtcttgtac 
catggtacaagactgcgacctatccg 
ctagcgtacagtgcgaagagggtaac 
catggttaccctcttcgcactgtacg 
ctagcgcctccgacacagttactaac 
catggttagtaactgtgtcggaggcg 
ctagcgtgggtctggaacactattgc 
catggcaatagtgttccagacccacg 
ctagcaactcctcactgttcactgcc 
catgggcagtgaacagtgaggagttg 
ctagccgtatggtatatcgtcagccc 
catggggctgacgatataccatacgg 
ctagcgcacagaccaaggacattacc 
catgggtaatgtccttggtctgtgcg 
ctagcctcccactaaacggtcctatc 
catggataggaccgtttagtgggagg 
ctagccacggaccgagtatctattgc 
catggcaatagatactcggtccgtgg 
ctagcgtctgatgagcgcagttagtc 
catggactaactgcgctcatcagacg 
ctagcgagaatgcggtgatagtgacc 
catgggtcactatcaccgcattctcg 
ctagcctacactaaccgacggaatgc 
catggcattccgtcggttagtgtagg 
ctagcaagtccttgttacgagtcccc 
catgggggactcgtaacaaggacttg 



X31 


X31U 


ctagcacctgaagttacgggaagtcc 




X31L 


catgggacttcccgtaacttcaggtg 


X32 


X32U 


ctagcaacaggttccttacagagggc 




X32L 


catggccctctgtaaggaacctgttg 


X33 


X33U 


ctagcacgtagccaagacttatcccc 




X33L 


catgggggataagtcttggctacgtg 


X34 


X34U 


ctagccagtcaggtctattgttgccc 




X34L 


catggggcaacaatagacctgactgg 


X35 


X35U 


ctagcgagaaccatattgcccctacc 




X35L 


catgggtaggggcaatatggttctcg 


X3 6 


X36U 


ctagccataaagaggagactgggctc 




X36L 


catggagcccagtctcctctttatgg 


X37 


X37U 


ctagccatagtctgcgaatactgccc 




X37L 


catggggcagtattcgcagactatgg 


X38 


X38U 


ctagcgttagggcagccttaggtaac 




X38L 


catggttacctaaggctgcccrtaacg 


X39 


X39U 


ctagcatagatgtgacagtgggacgc 




X3 9L 


catggcgtcccactgtcacatctatg 


X40 


X40U 


ctagcgattagagttgtagggcgctc 




X40L 


catggagcgccctacaactctaatcg 


X41 


X41D 


ctagctggaagtacgtgtgggatagc 




X41L 


catggctatcccacacgtacttccag 


X42 


X42U 


ctagctgagatagtcgtggatggtcc 




X42L 


catgggaccatccacgactatctcag 


X43 


X43U 


ctagcacgacctgttaccgttacctc 




X43L 


catggaggtaacggtaacaggtcgtg 


X44 


X44U 


ctagccattgctggaactgctactcc 




X44L 


catgggagtagcagttccagcaatgg 


X4 5 


X45U 


ctagcgtaccccacccacataaagtc 




X45L 


catggactttatgtgggtggggtacg 


X4 6 


X4 6U 


ctagcagtgtaggaatacccgtgctc 




X4 6L 


catggagcacgggtattcctacactg 


X47 


X47U 


ctagccgttcgctatgttctacagcc 




X47L 


catgggctgtagaacatagcgaacgg 


X48 


X48U 


ctagcccctgtccccatacattactc 




X48L 


catggagtaatgtatggggacagggg 


X4 9 


X49U 


ctagcatgtaggcaactgtcgtcacc 




X49L 


catgggtgacgacagttgcctacatg 


X50 


X50U 


ctagcgtatgttcccagcgagtacac 




X50L 


catggtgtactcgctgggaacatacg 


X51 


X51U 


ctagcgtgtaataccacgggtgttgc 




X51L 


catggcaacacccgtggtattacacg 


X52 


X52U 


ctagctgtagtggtctttacgtggcc 




- X52L 


catgggccacgtaaagaccactacag 
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X53' X53U ctagcgtcaggtgctcatccagttac 

X53L catggtaactggatgagcacctgacg 

X54 X54U ctagcctccacaaacatacccacacc 

X54L catgggtgtgggtatgtttgtggagg 

X55 X55U ctagctgaggtgtcaatgagagaggc 

X55L catggcctctctcattgacacctcag 

X5 6 X56U ctagctcactcctctacgatgaacgc 

X5 6L catggcgttcatcgtagaggagtgag 

X57 X57U ctagctcgtactgagatgttcctgcc 

X57L catgggcaggaacatctcagtacgag 

X58 X58U ctagctgactctggacttatgacgcc 

X58L catgggcgtcataagtccagagtcag 

X59 X5 9U ctagcagattacggagtgtgttcccc 

X5 9L catgggggaacacactccgtaatctg 

X60 X60U ctagccttaccactgttgcagaggac 

X60L catggtcctctgcaacagtggtaagg 

X61 X61U ctagcccgttatccgtactggaagtc 

X61L catggacttccagtacggataacggg 

X62 X62U ctagcgtttgggtatcggctgtactc 

X62L catggagtacagccgatacccaaacg 
X63 X63U ctagcgaggatgaacacaaggaggtc 
X63L catggacctccttgtgttcatcctcg 
X64 X64U ctagccatagcggatagggttacgtc 
X64L catggacgtaaccctatccgctatgg 
X65 X65U ctagctctcatatccctaacctggcc 
X65L catgggccaggttagggatatgagag 
X66 X66U ctagcggtattgtctcgtcatgcacc 
X66L catgggtgcatgacgagacaataccg 
X67 X67U ctagcatcacctacctgagattgccc 
X67L catggggcaatctcaggtaggtgatg 
X68 X68O ctagcgactggggtctgttgtttctc 
X68L catggagaaacaacagaccccagtcg 
X69 X69U ctagcaggggtcggagttagagaatc 
X69L catggattctctaactccgacccctg 
X70 X70U ctagcaccgaaggatgttgagtctcc 
X70L catgggagactcaacatccttcggtg 
X71 X71U ctagcggtgatgactcggaacttctc 
X71L catggagaagttccgagtcatcaccg 
X72 X72U ctagcgggtacacaaatggaaggtcc 
X72L catgggaccttccatttgtgtacccg 
X73 X73U ctagcatcatctcgggtctacttcgc 
X73L catggcgaagtagacccgagatgatg 
X7 4 X7 4U ctagcaatcagtagacatcgccctcc 
X7 4L catgggagggcgatgtctactgattg 
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X75 X75U ctagccacggaagtctattcacacgc 

X7 5L catggcgtgtgaatagacttccgtgg 

X7 6 X7 6U ctagcaggggctgttacatgaagagc 

X7 6L catggctcttcatgtaacagcccctg 

X77 X77U ctagccgtgacctgctgtgatttacc 

X77L catgggtaaatcacagcaggtcacgg 

X78 X78U ctagcatagtcttagtccggcatcgc 

X78L catggcgatgccggactaagactatg 

X7 9 X79U ctagcatcactacctgcctattcgcc 

X7 9L catgggcgaataggcaggtagtgatg 

X80 X80U ctagcttaccttgtctaaccctcgcc 

X80L catgggcgagggttagacaaggtaag 

X81 X81U ctagctcttaatccctctggctgacc 

X81L catgggtcagccagagggattaagag 

X82 X82U ctagcagacccatccgagtaccatac 

X82L catggtatggtactcggatgggtctg 

X83 X83U ctagcagactgaaccctcaagatgcc 

X83L catgggcatcttgagggttcagtctg 

X84 X84U ctagcaggcattctgatactcctcgc 

X84L catggcgaggagtatcagaatgcctg 

X85 X85U ctagcctgcaagagtcatatcacggc 

X85L catggccgtgatatgactcttgcagg 

X86 X8 6U ctagcgtgggatttcgtctataccgc 

X8 6L catggcggtatagacgaaatcccacg 

X87 X87U ctagccagaccgcctcctaatcttac 

X87L catggtaagattaggaggcggtctgg 

X8 8 X8 8U ctagcatctccttgcttgtacctcgc 

X88L catggcgaggtacaagcaaggagatg 

X8 9 X8 9U ctagctgtgcggctatgtagtcctac 

X8 9L catggtaggactacatagccgcacag 

X90 X90U ctagcagtgtcctaagcaaagacggc 

X90L catggccgtctttgcttaggacactg 

X91 X91U ctagcttcccttctaccctgtaccac 

X91L catggtggtacagggtagaagggaag 

X92 X92U ctagcaaggtgtccagtggtcagatc 

X92L catggatctgaccactggacaccttg 

X93 X93U ctagcatcagcatagggaccacactc 

X93L catggagtgtggtccctatgctgatg 

X94 X94U ctagcggacggccaactatcataagc 

X94L catggcttatgatagttggccgtccg 

X95 X95U ctagctgaacctctgtgctgtaggac 

X95L catggtcctacagcacagaggttcag 

X96 X96U ctagctgacttcacctggctgtctac 

X96L catggtagacagccaggtgaagtcag 



X97 X97U ctagctcctaactctattgccgagcc 

X97L catgggctcggcaatagagttaggag 

X98 X98U ctagccttatcctacgcgagcagatc 

X98L catggatctgctcgcgtaggataagg 

X99 X99U ctagccgcataa'gcacatctgtagcc 

X99L catgggctacagatgtgcttatgcgg 

X100 X100U ctagcaacgtagagaacgaagccagc 

X100L catggctggcttcgttctctacgttg 

X101 X101U ctagctcacggagtcctgatgtatgc 

X101L catggcatacatcaggactccgtgag 

XI 02. X102U ctagcatgtatcccgttggtgtaggc 

X102L catggcctacaccaacgggatacatg 

X103 X103U ctagcgcctgactattttgacggtcc 

X103L catgggaccgtcaaaatagtcaggcg 

XI 04 XI 040 ctagcgatagcacttcaaacgtcccc 

X104L catgggggacgtttgaagtgctatcg 

X105 X105U ctagcgaagtgcattcttaccgtccc 

X105L catggggacggtaagaatgcacttcg 

X106 X106U ctagcccttgtcgttctaaactgggc 

X106L catggcccagtttagaacgacaaggg 

X107 X107U ctagcgtttaaggagtgttcgtccgc 

X107L catggcggacgaacactccttaaacg 

X108 X108U ctagccagagttaaagttgccccacc 

X108L catgggtggggcaactttaactctgg 
X109 X109U ' ctagcgtcagaatcaatctcctggcc 

X109L catgggccaggagattgattctgacg 

XI 10 X110U ctagcgtgcttgacctgaatccttcc 

X110L catgggaaggattcaggtcaagcacg 

XI 11 X111U ctagcaccacccttcttcaccctatc 

X111L catggatagggtgaagaagggtggtg 

X112 X112U ctagccagaaagggcagtcctaatgc 

X112L catggcattaggactgccctttctgg 

X113 X113U ctagcaacccactatcgtacatccgc 

X113L catggcggatgtacgatagtgggttg 

X114 X114U ctagcgaatacttgggagttgcgagc 

X114L catggctcgcaactcccaagtattcg 

X115 X115U ctagcggcaataagttcaggctgtcc 

X115L catgggacagcctgaacttattgccg 

X116 X116U ctagcaagggagcagacagtcatgizc 

X116L catggacatgactgtctgctcccttg 

X117 X117U ctagccataacacacgggacaacacc 

X117L catgggtgttgtcccgtgtgttatgg 

X118 X118U ctagccagaaacgaaggtcctaacgc 

X118L catggcgttaggaccttcgtttctgg 
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X119 X119U ctagcgtgccaagccttaatagtgcc 

X119L catgggcactattaaggcttggcacg 

X120 X120U ctagccggatgaggatgtatcaggtc 

X120L catggacctgatacatcctcatccgg 

X121 X121U ctagccgtggcgttaaactcttaggc 

X121L catggcctaagagtttaacgccacgg 

X122 X122U ctagcctgctcttgatggataaggcc 

X122L catgggccttatccatcaagagcagg 

X123 X123U ctagcatcagagtggagagcacgatc 

X123L catggatcgtgctctccactctgatg 

X12 4 X124U ctagcgcacaggaaatagaagtcgcc 

X124L catgggcgacttctatttcctgtgcg 

X125 X125U ctagcccacccgactacagaacaatc 

X125L catggattgttctgtagtcgggtggg 

X126 X12 6U ctagccgagattacccagttgtggtc 

X12 6L catggaccacaactgggtaatctcgg 

X127 X127U ctagccgagttaatagacgcggaagc 

X127L catggcttccgcgtctattaactcgg 

X128 X128U ctagccgctgcggtttctctattacc 

X128L catgggtaatagagaaaccgcagcgg 

X129 X129U ctagctcagccgtaggtcttcaactc 

X12 9L catggagttgaagacctacggctgag 

X130 X130U ctagcttagtggagcctcttcacgtc 

X130L catggacgtgaagaggctccactaag 

XI 31 X131U ctagccagagtgtcggccttttaagc 

X131L catggcttaaaaggccgacactctgg 

X132 X132U ctagcctaaaacttgactcgcggacc 

X132L catgggtccgcgagtcaagttttagg 

X133 X133U ctagcccgtctggtcggataagatac 

X133L catggtatcttatccgaccagacggg 

XI 34 X134U ctagcagattggtcacaactccaggc 

X134L catggcctggagttgtgaccaatctg 

XI 35 X135U ctagcaccagaaggttggagaaggtc 

X135L catggacct-tctccaaccttctggtg 

X136 X136U ctagccacatcttacaccgtcatcgc 

X136L catggcgatgacggtgtaagatgtgg 

X137 X137U ctagcgcgtcgggacttacaagatac 

X137L catggtatcttgtaagtcccgacgcg 

XI 38 X138U ctagcaggatggtgcaagcatactcc 

X138L catgggagtatgcttgcaccatcctg 

X139 X139U ctagcggacaaactgctggttcctac 



XI 4 0 X140U ctagcggttgcctcaaactggagtac 
X140L catggtactccagtttgaggcaaccg 



X139L 



catggtaggaaccagcagtttgtccg 
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X141 X141U ctagctttgtaagacaccccagagcc 

X141L catgggctctggggtgtcttacaaag 

X14 2 X142U ctagcccctttggtcgttagtgctac 

X142L catggtagcactaacgaccaaagggg 

XI 4 3 X14 3U ctagcagagggctgagggatagaaac 

X14 3L catggtttctatccctcagccctctg 

XI 4 4 X14 4U ctagcagttcgtacctttgacacgcc 

X14 4L catgggcgtgtcaaaggtacgaactg 

X14 5 X14 5U ctagcgatttgcagtgataccacccc 

XI 4 5L catgggggtggtat cactgcaaat eg 

X14 6 X14 6U ctagcgagtgtagaatccggctgaac 

X14 6L catggttcagccggattctacactcg 

X147 X147U ctagcgcaagtccgtgttcctcttac 

X14 7L catggtaagaggaacacggacttgcg 

XI 4 8 X14 9U ctagcctttgtacggtggtgaaaccc 

X14 8L catggggtttcaccaccgtacaaagg 

X149 X14 9U ctagcactttcatgtggagacgctcc 

X14 9L catgggagcgtctccacatgaaagtg 

XI 50 XI SOU ctagcaacactaagcggatgtcagcc 

XI SOL catgggctgacatccgcttagtgttg 

X151 X151U ctagcctaactcaactccgcaacgtc 
X151L catggacgttgcggagttgagttagg 
X152 X152U ctagcaagtagatggaaagggtgccc 
X152L catggggcaccctttccatctacttg 
X153 X153U ctagcgaaatacgaggaacgagggtc 
X153L catggaccctcgttcctcgtatttcg 
X154 X154U ctagcgaggctgctaagggaaaaagc 
X154L catggctttttcccttagcagcctcg 
X155 X155U ctagcggacgttatttcacacctcgc 
X155L catggcgaggtgtgaaataacgtccg 
X156 X156U ctagcgacacaagagcctaagccaac 
X15 6L catggttggcttaggctcttgtgtcg 
X157 X157U ctagcgtgtttgtttaccctgccagc 



X158 X158U ctagcactgcgtcctcacaaagaagc 

X158L catggcttctttgtgaggacgcagtg 

X159 ^ X159U ctagcagtccccgaagtcaaatagcc 

X159L catgggctatttgacttcggggactg 

XI 60 X160U ctagcgatacttcccaatgcagacgc 

X160L catggcgtctgcattgggaagtatcg 

XI 61 X161U ctagcggaactaatgtcatgctgccc 

X161L catggggcagcatgacattagttccg 

XI 62 X162U ctagcgcgatggtgataacgtaagcc 

X162L catgggcttacgttatcaccatcgcg 



X157L 



catggctggcagggtaaacaaacacg 



X163 X163U ctagcctcctgtacggcaaatgttcc 

X163L catgggaacatttgccgtacaggagg 

X164 X164U ctagcctatttggaccgcacaagacc 

X164L catgggtcttgtgcggtccaaatagg 

X165 X165U ctagcatccttgtagcggacacgtac 

X165L catggtacgtgtccgctacaaggatg 

X166 X166U ctagcaagtgggaatagggtccgtac 

X166L catggtacggaccctattcccacttg 

XI 67 X167U ctagcgatagaaccgactgcaatgcc 

X167L catgggcattgcagtcggttctatcg 

XI 68 X168U ctagcgcgaatgaagtggactatgcc 

X168L catgggcatagtccacttcattcgcg 

XI 6 9 X169U ctagccgaactttagcatccgtgacc 

X169L catgggtcacggatgctaaagttcgg 

X170 X170U ctagcgttagaaatgtctgcggtcgc 

X170L catggcgaccgcagacatttctaacg 

XI 71 X171U ctagcgcaccacacgaaagaatctcc 

X171L catgggagattctttcgtgtggtgcg 

X172 X172U ctagctcacactcttggtcggctatc 

X17 2L catggatagccgaccaagagtgtgag 

XI 7 3 X173U ctagctcactctggccgaactgtatc 

X17 3L catggatacagttcggccagagtgag 

X174 X174U ctagctggagaacggtcttgtctgtc 

X17 4L catggacagacaagaccgttctccag 

X175 X175U ctagcaggtatcgcgtccctaatgtc 

X175L catggacattagggacgcgatacctg 

X17 6 X176U ctagctggctcactacacgaacatgc 

X17 6L catggcatgttcgtgtagtgagccag 

X177 X177U ctagcgtaccagtgaaatcgggacac 

X177L catggtgtcccgatttcactggtacg 

X178 X178U ctagccgaagaacaacatcagagccc 

X178L catggggctctgatgttgttcttcgg 

X179 X179U ctagcgggtccaatcctaaagcaagc 

X179L catggcttgctttaggattggacccg 

X180 X180U ctagcttgagcagaagaactctcgcc 

X180L catgggcgagagttcttctgctcaag 

X181 X181U ctagcattgccgtggtgtaactacgc 

X181L catggcgtagttacaccacggcaatg 

X182 X182U ctagccgcatacgttctgtgcagtac 

X182L catggtactgcacagaacgtatgcgg 

X183 X183U ctagccagcgatatggtggtgctatc ' 

X183L catggatagcaccaccatatcgctgg 

X184 X184U ctagccgtaacacccagagattggac 

X18 4L catggtccaatctctgggtgttacgg 
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X185 X185U 
X185L 
X186 X186U 
X186L 
X187 X187U 

X187L 
X188 X188U 

X188L 
X189 X189U 

X18 9L 
X190 X190U 

X190L 
X191 X191U 

X191L 
X192 X192U 

X192L 
X193 X193U 

X193L 
X194 X194U 

X194L 
X195 X195U 

X195L 
X196 X196U 

X196L 
X197 X197U 

X197L 
X198 X198U 

X198L 
X199 X199U 

X199L 
X200 X200U 

X200L 
X201 X201U 

X201L 
X202 X202U 

X202L 
X203 X203U 

X203L 
X204 X204U 

X204L 
X205 X205U 

X205L 
X206 X206U 

X2 06L 



ctagctccacgataactcacaagggc 

catggcccttgtgagttatcgtggag 

ctagctcacttccctggtcgtaatgc 

catggcattacgaccagggaagtgag 

ctagcgcgttgtcacctaatacggac 

catggtccgtattaggtgacaacgcg 

ctagcatgttacactcgcatccgacc 

catgggtcggatgcgagtgtaacatg 

ctagcatgtagccagtgattgtgccc 

catggggcacaatcactggctacatg 

ctagcaagatgcagagtctaccgcac 

catggtgcggtagactctgcatcttg 

ctagcagaccgctctatcacaagcac 

catggtgcttgtgatagagcggtctg 

ctagcagggaggtccagaatcttcac 

catggtgaagattctggacctccctg 

ctagcttagaccagcactgcgaagtc 

catggacttcgcagtgctggtctaag 

ctagcccgaggcgtgactatacaaac 

catggtttgtatagtcacgcctcggg 

ctagcgacaacctacaaacgctccac 

catggtggagcgtttgtaggttgtcg 

ctagccgtcattcgtcagaccagatc 

catggatctggtctgacgaatgacgg 

ctagcagatgagtgacgatggttgcc 

catgggcaaccatcgtcactcatctg 

ctagcacccacaaccaggaaaagtcc 

catgggacttttcctggttgtgggtg 

ctagctgaggcatctttggagtacgc 

catggcgtactccaaagatgcctcag 

ctagctgtaatcggtctcaggcaagc 

catggcttgcctgagaccgattacag 

ctagcctacagtcggattggctcaac 

catggttgagccaatccgactgtagg 

ctagccttcaaggacctcgtgcatac 

catggtatgcacgaggtccttgaagg 

ctagcggcaggataaagtgctgacac 

catggtgtcagcactttatcctgccg 

ctagcgcctggacggttaaaggttac 

catggtaacctttaaccgtccaggcg 

ctagctgtccaagtaccaaagagcgc 

catggcgctctttggtacttggacag 

ctagctgcacactgggtctaacacac 

catggtgtgttagacccagtgtgcag 
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X207 X207U ctagcaatctgtcgcctgcatagtgc 
X207L catggcactatgcaggcgacagattg 
X208 X208U ctagccccttcgctcttctttcatcc 
X208L catgggatgaaagaagagcgaagggg 
X209 X209U ctagcaagatattctcccacgaccgc 
X209L catggcggtcgtgggagaatatcttg 
X210 X210U ctagcggtttctgcactacagccaac 
X2 10L catggttggctgtagt gcagaaaccg 
X211 X211U ctagcaaaggcatcggacatcctacc 
X211L catgggtaggatgtccgatgcctttg 
X2 12 X212U ct agcacgaagtatgcgttgt gagcc 
X212L catgggctcacaacgcatacttcgtg 
X213 X213U ctagcggacatcggatttggataccc 
X213L catggggtatccaaatzccgatgtccg 
X214 X214U ctagccggtacttttgcgactctgac 
X214L catggtcagagtcgcaaaagtaccgg 
X215 X215U ctagcgttaccgctcaccgaaatctc 
X215L catggagatttcggtgagcggtaacg 
X216 X216U ctagcaagaagacgatgtgtgctcgc 
X216L catggcgagcacacatcgtcttcttg 
X217 X217U ctagcggctattcattctcgccctac 
X217L catggtagggcgagaatgaatagccg 
X218 X218U " ctagcaagggaatgtcagttctgccc 
X218L catggggcagaactgacattcccttg 
X219 X219U ctagcggcttgttaggccgttaagac 
X219L catggtcttaacggcctaacaagccg 
X220 X220U ctagcaaggatagagatgccgttgcc 
X220L catgggcaacggcatctctatccttg 
X221 X221U ctagcagcccgacatgacagatacac 
X221L catggtgtatctgtcatgtcgggctg 
X222 X222U ctagcttggagagactttgctgacgc 
X222L catggcgtcagcaaagtctctccaag 
X223 X223U ctagctgcgtgtacgggcttatagac 
X223L catggtctataagcccgtacacgcag 
X224 X224U ctagctcttattcgttagccgaggcc 
X22 4L catgggcctcggctaacgaataagag 
X225 X225U ctagcagccgaacaggacaaaactcc 
X225L catgggagttttgtcctgttcggctg 
X226 X226U ctagccatcattaacttcccctggcc 
X2 2 6L catgggccaggggaagttaatgatgg 
X227 X227U ctagcccgaactaaatgtcacagcgc 
X227L catggcgctgtgacatttagttcggg 
X228 X228U ctagcgacgcaatcttaaccaacccc 
X228L catgggggttggttaagattgcgtcg 
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X229 X229U 
X229L 
X230 X230U 
X230L 
X231 X231U 
X231L 
X232 X232U 
X232L 
X233 X233U 
X233L 
X234 X234U 
X234L 
X235 X235U 
X235L 
X236 X236U 
X236L 
X237 X237U 
* X237L 
X238 X238U 
X238L 
X239 X239U 
X239L 
X240 X240U 
X240L 
X241 X241U 
X241L 
X242 X242U 
X242L 
X243 X243U 
X243L 
X244 X244U 
X244L 
X245 X245U 
X245L 
X246 . X246U 
X24 6L 
X247 X247U 
X247L 
X248 X248U 
X248L 
X249 X249U 
X24 9L 
X250 X250U 
X250L 



ctagctgtatgcacatcaccgagtgc 

catggcactcggtgatgtgcatacag 

ctagccgtctcattctggtcattgcc 

catgggcaatgaccagaatgagacgg 

ctagcaggttgggttcccctatcatc 

catggatgataggggaacccaacctg 

ctagccggttgcttatcaggaatccc 

catggggattcctgataagcaaccgg 

ctagcacggccagtaagtgtccaatc 

catggattggacacttactggccgtg 

ctagcacacctgggcgtgaataactc 

catggagttattcacgcccaggtgtg 

ctagccttgggctcctgctaaaagac 

catggtcttttagcaggagcccaagg 

ctagcgcgacctctaagccttgaaac 

catggtttcaaggcttagaggt cgcg 

ctagccaggtttgggggtacatgatc 

catggatcatgtacccccaaacctgg 

ctagctcgttgtctgccctcttgtac 

catggtacaagagggcagacaacgag 

ctagcagacctcggtaaccacgaaac 

catggtttcgtggttaccgaggtctg 

ctagcagtaaggtctttgggtgccac 

catggtggcacccaaagaccttactg 

ctagcagtagccttgtgggaaccaac 

catggttggttcccacaaggctactg 

ctagcgattaacctctcggaatgcgc 

catggcgcattccgagaggttaatcg 

ctagcgcgatattgttccctgcttcc 

catgggaagcagggaacaatatcgcg 

ctagccaaccgtgttagcgaccatac 

catggtatggtcgctaacacggttgg 

ctagcgaatgggtggtggaaggatac 

catggtatccttccaccacccattcg 

ctagcaccaagtcgctgtcaactgac 

catggtcagttgacagcgacttggtg 

ctagccattttcaggaaggagacggc 

catggccgtctccttcctgaaaatgg 

ctagctggttggactgatacgaacgc 

catggcgttcgtatcagtccaaccag 

ctagcattcacccaatggtctacggc 

catggccgtagaccattgggtgaatg 

ctagcggtggtaggcatatccgaaac 

catggtttcggatatgcctaccaccg 
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X251 X251U ctagctgtctgtgaccttgggattgc 

X251L catggcaatcccaaggtcacagacag 

X252 X252U ctagccaagcctcaatgcctaaaggc 

X252L catggcctttaggcattgaggcttgg 

X253 X253U ctagcacttcttacgtcaccgcgatc 

X253L catggatcgcggtgacgtaagaagtg 

X254 X254U ctagcggttcctcaccattcggtt ac 

X254L catggtaaccgaatggtgaggaaccg 

X255 X255U ctagcagagtcaatgtcggttggctc 

X255L catggagccaaccgacattgactctg 
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Name Algomer 

00 tcgaggatatccgggcttcgggtgaccgggatcttaaga 

cctataggcccgaagcccactggccctagaattcttcga 

01 tcgaggatatcccttggcggtgaggcggcgatcttaaga 

cctatagggaaccgccactccgccgctagaattcttcga 

02 tcgaggatatctgtgggctacggggccgcgatcttaaga 

cctatagacacccgatgccccggcgctagaattcttcga 

03 tcgaggatatctaggtggctgcgcggggcgatcttaaga 

cctatagatccaccgacgcgccccgctagaattcttcga 

50 ttaagcgaagtggcctggcgggcgctagcg 

cgcttcaccggaccgcccgcgatcgcttaa ' 

51 ttaagaacccgagccccggacgggctagcg 

cttgggctcggggcctgcccgatcgcttaa 

52 ttaagcgcaccaggcagagcgcggctagcg 

cgcgtiggtccgtctcgcgccgatcgcttaa 

53 ttaagtcctcgtggtcgggcggcgctagcg 

caggagcaccagcccgccgcgatcgcttaa 
EO gatccccatggaatcccttggcggtgaggcggcgatatct 

gggtaccttagggaaccgccactccgccgctatagagatc 
El' gatccccatggaatctgtgggctacggggccgcgatatct 

gggtaccttagacacccgatgccccggcgctatagagatc 
E2 gatccccatggaatctaggtggctgcgcggggcgatatct 

gggtaccttagatccaccgacgcgccccgctatagagatc 
E3 gatccccatggaatccgcaagccccaaggccccgatatct 

gggtaccttaggcgttcggggttccggggctatagagatc 
XO ctagcgtatgccctgtagtttggagc 

gcatacgggacatcaaacctcggtac 
XI ctagcgtaaagccaactgtaccctgc 

gcatttcggttgacatgggacggtac 
X2 ctagcgagactaaggttgttggtcgc 

gctctgattccaacaaccagcggtac 
X3 ctagcgttccttcaagtcactcgtcc 

gcaaggaagttcagtgagcagggtac 
X4 ctagccctgcccacagtagaataagc 

gggacgggtgtcatcttattcggtac 
X5 ctagcgaatcaagactcgtgctaccc 

gcttagttctgagcacgatggggtac 
X6 ctagcgatgacagttagacgcttccc 

gctactgtcaatctgcgaaggggtac 
X7 ctagcctcctgaacctaagtcatcgc 

ggaggacttggattcagtagcggtac 
X8 ctagcccaaagttacagacctcagcc 

gggtttcaatgtctggagtcgggtac 
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X9 ctagcgacctatgcaatctactcgcc 

x gctggatacgttagatgagcgggtac 
X10 ctagccagtagtaatcttccgcaggc 

ggtcatcattagaaggcgtccggtac 
Xll ctagcctgttggtgggtactcctaac 

ggacaaccacccatgaggattggtac 
X12 ctagcaacgaggtggtacgacatagc 

gttgctccaccatgctgtatcggtac 
X13 ctagccgcagtcattatagagagccc 

ggcgtcagtaatatctctcggggtac 
X14 ctagcgtacttgtccctccagcatac 

gcatgaacagggaggtcgtatggtac 
X15 ctagcatcctgtgtgcctaagtacgc 

gtaggacacacggattcatgcggtac 
XI 6 ctagcgaggcacttgacgtatgactc 

gctccgtgaactgcatactgaggtac 
X17 ctagcgggatacctttctaagcgacc 

gccctatggaaagattcgctgggtac 
X18 ctagcggataggtcgcagtcttgtac 

gcctatccagcgtcagaacatggtac 
XI 9 ctagcgtacagtgcgaagagggtaac 

gcatgtcacgcttctcccattggtac 
X20 ctagcgcctccgacacagttactaac 

gcggaggctgtgtcaatgattggtac 
X21 ctagcgtgggtctggaacactattgc 

gcacccagaccttgtgataacggtac 
X22 ctagcaactcctcactgttcactgcc 

gttgaggagtgacaagtgacgggtac 
X23 ctagccgtatggtatatcgtcagccc 

ggcataccatatagcagtcggggtac 
X24 ctagcgcacagaccaaggacattacc 

gcgtgtctggttcctgtaatgggtac 
X25 ctagcctcccactaaacggtcctatc 

ggagggtgatttgccaggataggtac 
X26 ctagccacggaccgagtatctattgc 

ggtgcctggctcatagataacggtac 
X27 ctagcgtctgatgagcgcagttagtc 

gcagactactcgcgtcaatcaggtac 
X28 ctagcgagaatgcggtgatagtgacc 

gctcttacgccactatcactgggtac 
X29 ctagcctacactaaccgacggaatgc 

ggatgtgattggctgccttacggtac 
X30 ctagcaagtccttgttacgagtcccc 

gttcaggaacaatgctcagggggtac 
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X31 ctagcacctgaagttacgggaagtcc 

gtggacttcaatgcccttcagggtac 
X32 ctagcaacaggttccttacagagggc 

gttgtccaaggaatgtctcccggtac 
X33 ctagcacgtagccaagacttatcccc 

gtgcatcggttctgaatagggggtac 
X34 ctagccagtcaggtctattgttgccc 

ggtcagtccagataacaacggggtac 
X35 ctagcgagaaccatattgcccctacc 

gctcttggtataacggggatgggtac 
X36 ctagccataaagaggagactgggctc 

ggtatttctcctctgacccgaggtac 
X37 ctagccatagtctgcgaatactgccc 

ggtatcagacgcttatgacggggtacT 
X38 ctagcgttagggcagccttaggtaac 

gcaatcccgtcggaatccattggtac 
X39 ctagcatagatgtgacagtgggacgc 

gtatctacactgtcaccctgcggtac 
X4 0 ctagcgattagagttgtagggcgctc 

gctaatctcaacatcccgcgaggtac 
X4 1 ctagctggaagtacgtgtgggatagc 

gaccttcatgcacaccctatcggtac 
X42 ctagctgagatagtcgtggatggtcc 

gactctatcagcacctaccagggtac 
X43 ctagcacgacctgt'taccgttacctc 

gtgctggacaatggcaatggaggtac 
X4 4 ctagccattgctggaactgctactcc 

ggtaacgaccttgacgatgagggtac 
X45 ctagcgtaccccacccacataaagtc 

gcatggggtgggtgtatttcaggtac 
X4 6 ctagcagtgtaggaatacccgtgctc 

gtcacatccttatgggcacgaggtac 
X47 ctagccgttcgctatgttctacagcc 

ggcaagcgatacaagatgtcgggtac 
X4 8 ctagcccctgtccccatacattactc 

ggggacaggggtatgtaatgaggtac 
X4 9 ctagcatgtaggcaactgtcgtcacc 

gtacatccgttgacagcagtgggtac 
X50 ctagcgtatgttcccagcgagtacac 

gcatacaagggtcgctcatgtggtac 
X51 ctagcgtgtaataccacgggtgttgc 

gcacattatggtgcccacaacggtac 
X52 ctagctgtagtggtctttacgtggcc 

gacatcaccagaaatgcaccgggtac 
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X53 

X54 

X55 

X56 

X57 

X58 

X5 9 

X60 

X61 

X62 

X63 

X64 

X65 

X66 

X67 

X68 

X69 

X70 

X71 

X7 2 

X73 

X7 4 



ctagcgtcaggtgctcatccagttac 

gcagtccacgagtaggtcaatggtac 
ctagcctccacaaacatacccacacc 

ggaggtgtttgtatgggtgtgggtac 
ctagctgaggtgtcaatgagagaggc 

gactccacagttactctctccggtac 
ctagctcactcctctacgatgaacgc 

gagtgaggagatgctacttgcggtac 
ct age tcgt act gaga tgttcctgcc 

gagcatgactctacaaggacgggtac 
ctagctgactctggacttatgacgcc 

gactgagacctgaatactgcgggtac 
ctagcagattacggagtgtgttcccc 

gtctaatgcctcacacaagggggtac 
ctagccttaccactgttgcagaggac 

ggaatggtgacaacgtctcctggtac 
ctagcccgttatccgtactggaagtc 

gggcaataggcatgaccttcaggtac 
ctagcgtttgggtatcggctgtactc 

gcaaacccatagccgacatgaggtac 
ctagcgaggatgaacacaaggaggtc 

gctcctacttgtgttcctccaggtac 
ctagccatagcggatagggttacgtc 

ggtatcgcctatcccaatgcaggtac 
ctagctctcatatccctaacctggcc 

gagagtatagggattggacegggtae 
ctagcggtattgtctcgtcatgcacc 

gccataacagagcagtacgtgggtac 
ctagcatcacctacctgagattgccc 

gtagtggatggactctaacggggtac 
ctagcgactggggtctgttgtttctc 

gctgaccccagacaacaaagaggtac 
ctagcaggggtcggagttagagaatc 

gtccccagcctcaatctcttaggtac 
ctagcaccgaaggatgttgagtctcc 

gtggcttcctacaactcagagggtac 
ctagcggtgatgactcggaacttctc 

gccactactgagccttgaagaggtac 
ctagcgggtacacaaatggaaggtcc 

gcccatgtgtttaccttccagggtac 
ctagcatcatctcgggtctacttcgc 

gtagtagagcccagatgaagcggtac 
ctagcaatcagtagacatcgccctcc 

gttagtcatctgtagcgggagggtac 
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X7 5 ctagccacggaagtctattcacacgc 

ggtgccttcagataagtgtgcggtac 
X7 6 ctagcaggggctgttacatgaagagc 

gtccccgacaatgtacttctcggtac 
X7 7 ctagccgtgacctgctgtgatttacc 

ggcactggacgacactaaatgggtac 
X78 ctagcatagtcttagtccggcatcgc 

gtatcagaatcaggccgtagcggtac 
X7 9 ctagcatcactacctgcctattcgcc 

gtagtgatggacggataagcgggtac 
X8 0 ctagcttaccttgtctaaccctcgcc 

gaatggaacagattgggagcgggtac 
X8 1 ctagctcttaatccct ctggctgacc 

gagaattagggagaccgactgggtac 
X82 ctagcagacccatccgagtaccatac 

gtctgggtaggctcatggtatggtac 
X8 3 ctagcagactgaaccctcaagatgcc 

gtctgacttgggagttctacgggtac 
X84 ctagcaggcattctgatactcctcgc 

gtccgtaagactatgaggagcggtac 
X8 5 ct agcctgcaagagt catatcacggc 

ggacgttctcagtatagtgccggtac 
X8 6 ctagcgtgggatttcgtctataccgc 

gcaccctaaagcagatatggcggtac 
X87 ctagccagaccgcctcctaatcttac 

ggtctggcggaggattagaatggtac 
X8 8 ctagcatctccttgcttgtacctcgc 

gtagaggaacgaacatggagcggtac 
X8 9 ctagctgtgcggctatgtagtcctac 

gacacgccgatacatcaggatggtac 
X90 ctagcagtgtcctaagcaaagacggc 

gtcacaggattcgtttctgccggtac 
X91 ctagcttcccttctaccctgtaccac 

gaagggaagatgggacatggtggtac 
X92 ctagcaaggtgtccagtggtcagatc 

gttccacaggtcaccagtctaggtac 
X93 ctagcatcagcatagggaccacactc 

gtagtcgtatccctggtgtgaggtac 
X94 ctagcggacggccaactatcataagc 

gcctgccggttgatagtattcggtac 
X95 ctagctgaacctctgtgctgtaggac 

gacttggagacacgacatcctggtac 
X96 ctagctgacttcacctggctgtctac 

gactgaagtggaccgacagatggtac 



19 

X97 ctagctcctaactctattgccgagcc 

gaggattgagataacggctcgggtac 
X98 ctagccttatcctacgcgagcagatc 

ggaataggatgcgctcgtctaggtac 
X99 ctagccgcataagcacatctgtagcc 

ggcgtattcgtgtagacatcgggtac 
X100 ctagcaacgtagagaacgaagccagc 

gttgcatctcttgcttcggtcggtac 
X101 ctagctcacggagtcctgatgtatgc 

gagtgcctcaggactacatacggtac 
X102 ctagcatgtatcccgttggtgtaggc 

gtacatagggcaaccacatccggtac 
X103 ctagcgcctgactattttgacggtcc 

gcggactgataaaactgccagggtac 
XI 04 ctagcgatagcacttcaaacgtcccc 

gctatcgtgaagtttgcagggggtac 
X105 ctagcgaagtgcattcttaccgtccc 

gcttcacgtaagaatggcaggggtac 
X106 ctagcccttgtcgttctaaactgggc 

gggaacagcaagatttgacccggtac 
X107 ctagcgtttaaggagtgttcgtccgc 

gcaaattcctcacaagcaggcggtac 
X108 ctagccagagttaaagttgccccacc 

ggtctcaatttcaacggggtgggtac 
X109 ctagcgtcagaatcaatctcctggcc 

gcagtcttagttagaggaccgggtac 
X110 ctagcgtgcttgacctgaatccttcc 

gcacgaactggacttaggaagggtac 
XI 11 ctagcaccacccttcttcaccctatc 

gtggtgggaagaagtgggataggtac 
X112 ctagccagaaagggcagtcctaatgc 

ggtctttcccgtcaggattacggtac 
XI 13 ctagcaacccactatcgtacatccgc 

gttgggtgatagcatgtaggcggtac 
X114 ctagcgaatacttgggagttgcgagc 

gcttatgaaccctcaacgctcggtac 
XI 15 ctagcggcaataagttcaggctgt cc 

gccgttattcaagtccgacagggtac 
X116 ctagcaagggagcagacagtcatgtc 

gttccctcgtctgtcagtacaggtac 
X117 ctagccataacacacgggacaacacc 

ggtattgtgtgccctgttgtgggtac 
XI 18 ctagccagaaacgaaggtcctaacgc 

ggtctttgcttccaggattgcggtac 
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X119 

X120 

X121 

X122 

X123 

X124 

X125 

X126 

X127 

X128 

X129 

X130 

X131 

X132 

X133 

X134 

X135 

X136 

X137 

X138 

X139 

X140 



ctagcgtgccaagccttaatagtgcc 

gcacggttcggaattatcacgggtac 
ctagccggatgaggatgtatcaggtc 

ggcctactcctacatagtccaggtac 
ctagccgtggcgttaaactcttaggc 

ggcaccgcaatttgagaatccggtac 
ctagcctgctcttgatggataaggcc 

ggacgagaactacctattccgggtac 
ctagcatcagagtggagagcacgatc 

gtagtctcacctctcgtgctaggtac 
ctagcgcacaggaaatagaagtcgcc 

gcgtgtcctttatcttcagcgggtac 
ctagcccacccgactacagaacaatc 

gggtgggctgatgtcttgttaggtac 
ctagccgagattacccagttgtggtc 

ggctct:aatgggtcaacaccaggtac 
ctagccgagttaatagacgcggaagc 

ggctcaattatctgcgccttcggtac 
ctagccgctgcggtttctctattacc 

ggcgacgccaaagagataatgggtac 
ctagctcagccgtaggtcttcaactc 

gagtcggcatccagaagttgaggtac 
ctagcttagtggagcctcttcacgtc 

gaatcacctcggagaagtgcaggtac 
ctagccagagtgtcggccttttaagc 

ggtctcacagccggaaaattcggtac 
ctagcctaaaacttgactcgcggacc 

ggattttgaactgagcgcctgggtac 
ctagcccgtctggtcggataagatac 

gggcagaccagcctattctatggtac 
ctagcagattggtcacaactccaggc 

gtctaaccagtgttgaggtccggtac 
ctagcaccagaaggttggagaaggtc 

gtggtcttccaacctcttccaggtac 
ctagccacatcttacaccgtcatcgc 

ggtgtagaatgtggcagtagcggtac 
ctagcgcgtcgggacttacaagatac 

gcgcagccctgaatgttctatggtac 
ctagcaggatggtgcaagcatactcc 

gtcctaccacgttcgtatgagggtac 
ctagcggacaaactgctggttcctac 

gcctgtttgacgaccaaggatggtac 
ctagcggttgcctcaaactggagtac 



gccaacggagtttgacctcatggtac 
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X141 ctagctttgtaagacaccccagagcc 

gaaacattctgtggggtctcgggtac 
XI 4 2 ctagcccctttggtcgttagtgctac 

ggggaaaccagcaatcacgatggtac 
X14 3 ctagcagagggctgagggatagaaac 

gtctcccgactccctatctttggtac 
X14 4 ctagcagttcgtacctttgacacgcc 

gtcaagcatggaaactgtgcgggtac 
X14 5 ctagcgatttgcagtgataccacccc 

gctaaacgtcactatggtgggggtac 
X14 6 ctagcgagtgtagaatccggctgaac 

gctcacatcttaggccgacttggtac 
X147 ctagcgcaagtccgtgttcctcttac 

gcgttcaggcacaaggagaatggtac 
X14 8 ctagcctttgtacggtggtgaaaccc 

ggaaacatgccaccactttggggtac 
X14 9 ctagcactttcatgtggagacgctcc 

gtgaaagtacacctctgcgagggtac 
X150 ctagcaacactaagcggatgtcagcc 

gttgtgattcgcctacagtcgggtac 
XI 51 ctagcctaactcaactccgcaacgtc 

ggattgagttgaggcgttgcaggtac 
X152 ctagcaagtagatggaaagggtgccc 

gttcatctacctttcccacggggtac 
X153 ctagcgaaatacgaggaacgagggtc 

gctttatgctccttgctcccaggtac 
XI 5 4 ctagcgaggctgctaagggaaaaagc 

gctccgacgattccctttttcggtac 
XI 5 5 ctagcggacgttatttcacacctcgc 

gcctgcaataaagtgtggagcggtac 
XI 5 6 ctagcgacacaagagcctaagccaac 

gctgtgttctcggattcggttggtac 
X157 ctagcgtgtttgtttaccctgccagc 

gcacaaacaaatgggacggtcggtac 
X158 ctagcactgcgtcctcacaaagaagc 

gtgacgcaggagtgtttcttcggtac 
X15 9 ctagcagtccccgaagtcaaatagcc 

gtcaggggcttcagtttatcgggtac 
XI 60 ctagcgatacttcccaatgcagacgc 

gctatgaagggttacgtctgcggtac 
XI 61 ctagcggaactaatgtcatgctgccc 

gccttgattacagtacgacggggtac 
XI 62 ctagcgcgatggtgataacgtaagcc 

gcgctaccactattgcattcgggta 
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X163 ctagcctcctgtacggcaaatgttcc 

ggaggacatgccgtttacaagggtac 
XI 64 ctagcctatttggaccgcacaagacc 

ggataaacctggcgtgttctgggtac 
XI 6 5 ctagcatccttgtagcggacacgtac 

gtaggaacatcgcctgtgcatggtac 
XI 6 6 ctagcaagtgggaatagggtccgtac 

gttcacccttatcccaggcatggtac 
XI 67 ctagcgatagaaccgactgcaatgcc 

gctatcttggctgacgttacgggtac 
XI 68 ctagcgcgaatgaagtggactatgcc 

gcgcttacttcacctgatacgggtac 
XI 6 9 ctagccgaactttagcatccgtgacc 

ggcttgaaatcgtaggcactgggtac 
X170 ctagcgttagaaatgtctgcggtcgc 

gcaatctttacagacgccagcggtac 
X171 ctagcgcaccacacgaaagaatctcc 

gcgtggtgtgctttcttagagggtac 
X172 ctagctcacactcttggtcggctatc 

gagtgtgagaaccagccgataggtac 
X17 3 ctagctcactctggccgaactgtatc 

gagtgagaccggcttgacataggtac 
X17 4 ctagctggagaacggtcttgtctgtc 

gacctcttgccagaacagacaggtac 
X175 ctagcaggtatcgcgtccctaatgtc 

gtccatagcgcagggattacaggtac 
X17 6 ctagctggctcactacacgaacatgc 

gaccgagtgatgtgcttgtacggtac 
X177 ctagcgtaccagtgaaatcgggacac 

gcatggtcactttagccctgtggtac 
X17 8 ctagccgaagaacaacatcagagccc 

ggcttcttgttgtagtctcggggtac 
X17 9 ctagcgggtccaatcctaaagcaagc 

gcccaggttaggatttcgttcggtac 
X180 ctagcttgagcagaagaactctcgcc 

gaactcgtcttcttgagagcgggtac 
X181 ctagcattgccgtggtgtaactacgc 

gtaacggcaccacattgatgcggtac 
X182 ctagccgcatacgttctgtgcagtac 

ggcgtatgcaagacacgtcatggtac 
X183 ctagccagcgatatggtggtgctatc 

ggtcgctataccaccacgataggtac 
XI 8 4 ctagccgtaacacccagagattggac 

ggcattgtgggtctctaacctggtac 
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XI 8 5 ctagctccacgataactcacaagggc 

gaggtgctattgagtgttcccggtac 
X18 6 ctagctcacttccctggtcgtaatgc 

gagtgaagggaccagcattacggtac 
X18 7 ctagcgcgttgtcacctaatacggac 

gcgcaacagtggattatgcctggtac 
X188 ctagcatgttacactcgcatccgacc 

gtacaatgtgagcgtaggctgggtac 
X18 9 ctagcatgtagccagtgattgtgccc 

gtacatcggtcactaacacggggtac 
X190 ctagcaagatgcagagtctaccgcac 

gttctacgtctcagatggcgtggtac 
X191 ctagcagaccgctctatcacaagcac 

gtctggcgagatagtgttcgtggtac 
XI 92 ctagcagggaggtccagaatcttcac 

gtccctccaggtcttagaagtggtac 
X193 ctagcttagaccagcactgcgaagtc 

gaatctggtcgtgacgcttcaggtac 
XI 94 ctagcccgaggcgtgactatacaaac 

gggctccgcactgatatgtttggtac 
XI 95 ctagcgacaacctacaaacgctccac 

gctgttggatgtttgcgaggtggtac 
X196 ctagccgtcattcgtcagaccagatc 

ggcagtaagcagtctggtctaggtac 
XI 97 ctagcagatgagtgacgatggttgcc 

gtctactcactgctaccaacgggtac 
XI 9 8 ctagcacccacaaccaggaaaagtcc 

gtgggtgttggtccttttcagggtac 
X199 ctagctgaggcatctttggagtacgc 

gactccgtagaaacctcatgcggtac 
X200 ctagctgtaatcggtctcaggcaagc 

gacattagccagagtccgttcggtac 
X201 ctagcctacagtcggattggctcaac 

ggatgtcagcctaaccgagttggtac 
X202 ctagccttcaaggacctcgtgcatac 

ggaagttcctggagcacgtatggtac 
X203 ctagcggcaggataaagtgctgacac 

gccgtcctatttcacgactgtggtac 
X204 ctagcgcctggacggttaaaggttac 

gcggacctgccaatttccaatggtac 
X205 ctagctgtccaagtaccaaagagcgc 

gacaggttcatggtttctcgcggtac 
X206 ctagctgcacactgggtctaacacac 

gacgtgtgacccagattgtgtggtac 
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X207 ctagcaatctgtcgcctgcatagtgc 

gttagacagcggacgtatcacggtac 
X208 ctagccccttcgctcttctttcatcc 

ggggaagcgagaagaaagtagggtac 
X209 ctagcaagatattctcccacgaccgc 

gttctataagagggtgctggcggtac 
X210 ctagcggtttctgcactacagccaac 

gccaaagacgtgatgtcggttggtac 
X211 ctagcaaaggcatcggacatcctacc 

gtttccgtagcctgtaggatgggtac 
X212 ctagcacgaagtatgcgttgtgagcc 

gtgcttcatacgcaacactcgggtac 
X213 ctagcggacatcggatttggataccc 

gcctgtagcctaaacctatggggtac 
X214 ctagccggtacttttgcgactctgac 

ggccatgaaaacgctgagactggtac 
X215 ctagcgttaccgctcaccgaaatctc 

gcaatggcgagtggctttagaggtac 
X216 ctagcaagaagacgatgtgtgctcgc 

gttcttctgctacacacgagcggtac 
X217 ctagcggctattcattctcgccctac 

gccgataagtaagagcgggatggtac 
X218 ctagcaagggaatgtcagttctgccc 

gttcccttacagtcaagacggggtac 
X219 ctagcggcttgttaggccgttaagac 

gccgaacaatccggcaattctggtac 
X220 ctagcaaggatagagatgccgttgcc 

gttcctatctctacggcaacgggtac 
X22 1 ctagcagcccgacatgacagatacac 

gtcgggctgtactgtctatgtggtac 
X222 ctagcttggagagactttgctgacgc 

gaacctctctgaaacgactgcggtac 
X223 ctagctgcgtgtacgggcttatagac 

gacgcacatgcccgaatatctggtac 
X224 ctagctcttattcgttagccgaggcc 

gagaataagcaatcggctccgggtac 
X22 5 ctagcagccgaacaggacaaaactcc 

gtcggcttgtcctgttttgagggtac 
X22 6 ctagccatcattaacttcccctggcc 

ggtagtaattgaaggggaccgggtac 
X227 ctagcccgaactaaatgtcacagcgc 

gggcttgatttacagtgtcgcggtac 
X228 ctagcgacgcaatcttaaccaacccc 

gctgcgttagaattggttgggggtac 
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X229 ctagctgtatgcacatcaccgagtgc 

gacatacgtgtagtggctcacggtac 
X230 ctagccgtctcattctggtcattgcc 

ggcagagtaagaccagtaacgggtac 
X231 ctagcaggttgggttcccctatcatc 

gtccaacccaaggggatagtaggtac 
X232 ctagccggttgcttatcaggaatccc 

ggccaacgaatagtccttaggggtac 
X233 ctagcacggccagtaagtgtccaatc 

gtgccggtcattcacaggttaggtac 
X234 ctagcacacctgggcgtgaataactc 

gtgtggacccgcacttattgaggtac 
X2 3 5 ctagcctt gggct cctgctaaaagac 

ggaacccgaggacgattttctggtac 
X23 6 ctagcgcgacctctaagccttgaaac 

gcgctggagattcggaactttggtac 
X237 ctagccaggtttgggggtacatgatc 

ggtccaaacccccatgtactaggtac 
X238 ctagctcgttgtctgccctcttg-tac 

gagcaacagacgggagaacatggtac 
X239 ctagcagacctcggtaaccacgaaac 

gtctggagccattggtgctttggtac 
X24 0 ctagcagtaaggtctttgggtgccac 

gtcattccagaaacccacggtggtac 
X241 ctagcagtagccttgtgggaaccaac 

gtcatcggaacacccttggttggtac 
X242 ctagcgattaacctctcggaatgcgc 

gctaattggagagccttacgcggtac 
X243 ctagcgcgatattgttccctgcttcc 

gcgctataacaagggacgaagggtac 
X24 4 ctagccaaccgtgttagcgaccatac 

ggttggcacaatcgctggtatggtac 
X24 5 ctagcgaatgggtggtggaaggatac 

gcttacccaccaccttcctatggtac 
X24 6 ctagcaccaagtcgctgtcaactgac 

gtggttcagcgacagttgactggtac 
X24 7 ctagccattttcaggaaggagacggc 

ggtaaaagtccttcctctgccggtac 
X248 ctagctggttggactgatacgaacgc 

gaccaacctgactatgcttgcggtac 
X24 9 ctagcattcacccaatggtctacggc 

gtaagtgggttaccagatgccggtac 
X250 ctagcggtggtaggcatatccgaaac 

gccaccatccgtataggctttggtac 
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X251 ctagctgtctgtgaccttgggattgc 

gacagacactggaaccctaacggtac 
X252 ctagccaagcctcaatgcctaaaggc 

ggttcggagttacggatttccggtac 
X253 ctagcacttcttacgtcaccgcgatc 

gtgaagaatgcagtggcgctaggtac 
X254 ctagcggttcctcaccattcggttac 

gccaaggagtggtaagccaatggtac 
X255 ctagcagagtcaatgtcggttggctc 

gtctcagttacagccaaccgaggtac 



